
Single and Multi-CPU Performance Modeling for Embedded Systems
Trevor Meyerowitz

Citation
Trevor Meyerowitz. "Single and Multi-CPU Performance Modeling for Embedded Systems". PhD thesis, University of California at Berkeley, April, 2008.

Abstract
The combination of increasing design complexity, increasing concurrency, growing heterogeneity, and decreasing time-to-market windows has caused a crisis for embedded system developers. To deal with this problem, dedicated hardware is being replaced by a growing number of microprocessors in these systems, making software a dominant factor in design time and cost. The use of higher-level models for design space exploration and early software development is critical. Much progress has been made on increasing the speed of cycle-level simulators for microprocessors, but they may still be too slow for large-scale systems and are too low-level (i.e., they require a detailed implementation) for effective design space exploration. Furthermore, constructing such optimized simulators is a significant task because the particularities of the hardware must be accounted for. For this reason, these simulators are not very flexible. This thesis focuses on modeling the performance of software executing on embedded processors in the context of a heterogeneous multi-processor system on chip in a more flexible and scalable manner than current approaches. We contend that such systems need to be modeled at a higher level of abstraction and, to ensure accuracy, the higher level must have a connection to lower levels. First, we describe different levels of abstraction for modeling such systems and how their speed and accuracy relate. Next, the high-level modeling of both individual processing elements and a bus-based microprocessor system is presented. Finally, an approach for automatically annotating timing information obtained from a cycle-level model back to the original application source code is developed. The annotated source code can then be simulated without the underlying architecture and still maintain good timing accuracy.
These methods are driven by execution traces produced by lower-level models and were developed for ARM microprocessors and MuSIC, a heterogeneous multiprocessor for Software Defined Radio from Infineon. The annotated source code executed between one and three orders of magnitude faster than equivalent cycle-level models, with good accuracy for most applications tested.
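
The idea of annotating timing back to source code can be illustrated with a small sketch. This is not the thesis's tool or algorithm: the trace format (pairs of source line and cycle count), the averaging policy, and every name below (average_cycles_per_line, TimedRun, annotated_sum) are assumptions made up for illustration. It only shows the general shape of the idea: derive a per-line delay from a lower-level trace, then run the application natively while accumulating those delays.

```python
from collections import defaultdict


def average_cycles_per_line(trace):
    """Average the cycle cost observed for each source line.

    `trace` is a hypothetical list of (source_line, cycles) pairs, standing in
    for whatever a cycle-level model would actually emit.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for line, cycles in trace:
        totals[line] += cycles
        counts[line] += 1
    return {line: totals[line] / counts[line] for line in totals}


class TimedRun:
    """Accumulates estimated cycles while annotated code runs natively."""

    def __init__(self, delays):
        self.delays = delays
        self.cycles = 0.0

    def tick(self, line):
        # Charge the averaged cost of one source line (0 if never traced).
        self.cycles += self.delays.get(line, 0.0)


def annotated_sum(values, run):
    """A toy 'annotated source': each statement is paired with a tick()."""
    total = 0
    run.tick(1)          # line 1: total = 0
    for v in values:
        run.tick(2)      # line 2: loop header
        total += v
        run.tick(3)      # line 3: total += v
    return total


# Made-up trace: line 3 cost 4 cycles once and 6 cycles once.
trace = [(1, 2), (2, 1), (3, 4), (2, 1), (3, 6), (2, 1)]
delays = average_cycles_per_line(trace)
run = TimedRun(delays)
result = annotated_sum([10, 20, 30], run)
print(result, run.cycles)
```

With the made-up trace above, the annotated function returns 60 and the run estimates 20 cycles. Averaging is the simplest possible policy; it hides data-dependent variation in line cost, which is one plausible reason a real annotation flow would report good accuracy for most, rather than all, applications.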

Citation formats  
  • HTML
    Trevor Meyerowitz. <a
    href="http://chess.eecs.berkeley.edu/pubs/458.html"
    ><i>Single and Multi-CPU Performance Modeling for
    Embedded Systems</i></a>, PhD thesis, 
    University of California at Berkeley, April, 2008.
  • Plain text
    Trevor Meyerowitz. "Single and Multi-CPU Performance
    Modeling for Embedded Systems". PhD thesis,  University
    of California at Berkeley, April, 2008.
  • BibTeX
    @phdthesis{Meyerowitz08_SingleMultiCPUPerformanceModelingForEmbeddedSystems,
        author = {Trevor Meyerowitz},
        title = {Single and Multi-CPU Performance Modeling for
                  Embedded Systems},
        school = {University of California at Berkeley},
        month = {April},
        year = {2008},
        abstract = {The combination of increasing design complexity,
                  increasing concurrency, growing heterogeneity, and
                  decreasing time-to-market windows has caused a
                  crisis for embedded system developers. To deal
                  with this problem, dedicated hardware is being
                  replaced by a growing number of microprocessors in
                  these systems, making software a dominant factor
                  in design time and cost. The use of higher-level
                  models for design space exploration and early
                  software development is critical. Much progress
                  has been made on increasing the speed of
                  cycle-level simulators for microprocessors, but
                  they may still be too slow for large-scale systems
                  and are too low-level (i.e., they require a
                  detailed implementation) for effective design
                  space exploration. Furthermore, constructing such
                  optimized simulators is a significant task because
                  the particularities of the hardware must be
                  accounted for. For this reason, these simulators
                  are not very flexible. This thesis focuses on
                  modeling the performance of software executing on
                  embedded processors in the context of a
                  heterogeneous multi-processor system on chip in a
                  more flexible and scalable manner than current
                  approaches. We contend that such systems need to
                  be modeled at a higher level of abstraction and,
                  to ensure accuracy, the higher level must have a
                  connection to lower levels. First, we describe
                  different levels of abstraction for modeling such
                  systems and how their speed and accuracy relate.
                  Next, the high-level modeling of both individual
                  processing elements and a bus-based
                  microprocessor system is presented. Finally, an
                  approach for automatically annotating timing
                  information obtained from a cycle-level model back
                  to the original application source code is
                  developed. The annotated source code can then be
                  simulated without the underlying architecture and
                  still maintain good timing accuracy. These methods
                  are driven by execution traces produced by
                  lower-level models and were developed for ARM
                  microprocessors and MuSIC, a heterogeneous
                  multiprocessor for Software Defined Radio from
                  Infineon. The annotated source code executed
                  between one and three orders of magnitude faster
                  than equivalent cycle-level models, with good
                  accuracy for most applications tested.},
        URL = {http://chess.eecs.berkeley.edu/pubs/458.html}
    }
    

Posted by Trevor Meyerowitz on 23 Jun 2008.
Groups: chess

Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright.
