Design Support for Embedded Processors and Applications

> Prof. Kurt Keutzer EECS University of California Berkeley, CA keutzer@eecs.berkeley.edu







# Solution: ASIC => ASSP => ASIP

**ASIP:** Programmable Platforms

- Develop platforms that allow for amortization of design costs over multiple generations
- Make platforms *programmable* so that they have maximum flexibility with minimum overhead













# Addressing the problem areas

Modern Embedded Systems: Compilers, Architectures, and Languages

#### **MESCAL** research mission:

- To bring a disciplined methodology, and a supporting tool set, to the development, deployment, and programming of application-specific programmable platforms, a.k.a. application-specific instruction-set processors (ASIPs).

Invited paper: "From ASIC to ASIP: The Next Design Discontinuity", K. Keutzer, S. Malik, R. Newton, Proceedings of ICCD, pp. 84-91, 2002. www.gigascale.org/mescal



### Three Key Problem Areas

- Development of programmable platforms:
  - Characterizing target applications
  - Design space exploration
- Deployment of programmable platforms:
  - Development of programming model
  - Provision of software environment
- Mapping applications onto programmable platforms:
  - Application modeling
  - Application mapping



# **Our Approach**

- Bottom-up view: create abstractions of existing devices
  - Opacity: hide micro-architectural details from the programmer
  - Visibility: expose sufficient detail of the architecture to allow the programmer to improve the efficiency of the program
- Top-down view: experiment with existing modeling/programming environments
  - Learn from their abstractions of the devices
  - Try to maximize performance within these environments

# **Our Constraint/Angle/Prejudice**

- In real-time embedded systems, correct logical functionality can never be divorced from system performance
- In commercial (especially consumer-oriented) embedded systems, system price is of utmost concern
- Quantitative
  - (Quantitatively) examine trade-offs among:
    - Quality of results (e.g. speed, but also power, device cost)
    - Programmer productivity (how long does all this take?)











#### **Our own NPU programming environment: NPClick**

- Based on Click
  - Popular environment for describing/implementing network applications
  - Developed by Eddie Kohler, MIT => ICSI
- NPClick
  - Implemented a subset of the element library in IXP uC
  - Element communication via function calls maintained semantics (packet push/pull)
  - Packet storage fixed: header in SRAM, payload in DRAM
- Designer needs to specify:
  - Thread boundaries
  - Thread/uEngine assignment
  - Memory allocation of queues (SRAM, DRAM, Scratch)
- **Opportunities for optimization (future work)**
  - Redundant memory loads/stores based on element/thread mapping
  - Schemes for multiplexing hardware resources among multiple element instantiations (e.g. muxing the TFIFO among 8 ToDevice elements)
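
The push/pull semantics mentioned above can be illustrated with a minimal sketch. This is not NPClick or Click code: the Python classes below are simplified, hypothetical stand-ins that borrow the Click element names `FromDevice`, `Queue`, and `ToDevice` only to show who initiates a packet transfer. In a push connection the upstream element calls downstream; in a pull connection the downstream element asks upstream for a packet; a queue sits at the boundary between the two.

```python
from collections import deque


class Queue:
    """Push input, pull output: the boundary between the two styles."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()
        self.drops = 0

    def push(self, packet):
        # Called by the upstream element: the source initiates the transfer.
        if len(self.packets) >= self.capacity:
            self.drops += 1  # queue full, packet is dropped
        else:
            self.packets.append(packet)

    def pull(self):
        # Called by the downstream element: the sink initiates the transfer.
        return self.packets.popleft() if self.packets else None


class FromDevice:
    """Hypothetical receive element: pushes each arriving packet downstream."""

    def __init__(self, downstream):
        self.downstream = downstream

    def receive(self, packet):
        self.downstream.push(packet)


class ToDevice:
    """Hypothetical transmit element: pulls a packet when the port is ready."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.sent = []

    def port_ready(self):
        packet = self.upstream.pull()
        if packet is not None:
            self.sent.append(packet)
        return packet


def demo():
    queue = Queue(capacity=2)
    rx = FromDevice(queue)
    tx = ToDevice(queue)
    for p in ("p0", "p1", "p2"):  # the third packet overflows the queue
        rx.receive(p)
    for _ in range(3):            # the third pull finds the queue empty
        tx.port_ready()
    return tx.sent, queue.drops
```

Because every transfer here is a plain function call, the element graph stays the same regardless of how elements are later grouped into threads, which is the property the slide attributes to NPClick's function-call-based element communication.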



# **Productivity Estimates**

- "First-time" learning-curve issues make it difficult to compare the productivity of these approaches
- Based on our experience, we estimate the following design times for implementing an IPv4 router

| Approach | Time to functional correctness | Additional time for performance tuning |
|---------|--------------------------------|-------------------------------------------|
| ASM     | 8 weeks                        | 8 weeks                                   |
| uC      | 4 weeks                        | 6 weeks                                   |
| Teja    | 2 weeks                        | 3-4 weeks                                 |
| NPClick | 2 days                         | 2 weeks                                   |

- The advantages of Teja and NPClick come from the ability to perform design-space exploration at a higher level of abstraction
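
To compare the rows of the table on a single scale, the two columns can be summed per approach. The sketch below does that arithmetic; the conversion of 5 working days per week is an assumption (the source does not say how "2 days" relates to a week), and the names `ESTIMATES_DAYS` and `total_days` are illustrative only.

```python
WORKING_DAYS_PER_WEEK = 5  # assumption: not stated in the source


def weeks(n):
    """Convert weeks to working days under the assumption above."""
    return n * WORKING_DAYS_PER_WEEK


# Each entry: ((functional_low, functional_high), (tuning_low, tuning_high)),
# in working days, taken from the table of estimates.
ESTIMATES_DAYS = {
    "ASM": ((weeks(8), weeks(8)), (weeks(8), weeks(8))),
    "uC": ((weeks(4), weeks(4)), (weeks(6), weeks(6))),
    "Teja": ((weeks(2), weeks(2)), (weeks(3), weeks(4))),
    "NPClick": ((2, 2), (weeks(2), weeks(2))),
}


def total_days(approach):
    """Return (low, high) total working days for one approach."""
    (f_lo, f_hi), (t_lo, t_hi) = ESTIMATES_DAYS[approach]
    return (f_lo + t_lo, f_hi + t_hi)


if __name__ == "__main__":
    for name in ESTIMATES_DAYS:
        lo, hi = total_days(name)
        span = f"{lo}" if lo == hi else f"{lo}-{hi}"
        print(f"{name:8s} {span} working days")
```

Under this assumption the totals are 80 days for ASM, 50 for uC, 25 to 30 for Teja, and 12 for NPClick. These are the authors' estimates restated, not measurements, and the ratios shift if a different days-per-week figure is used.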

# **Conclusions: Programming Embedded Systems**

- Neither ASICs nor general-purpose processors will fill the needs of most embedded system applications
- System design teams will increasingly choose ASIPs/programmable platforms
- Programming these devices is a new challenge:
  - Parallelism
    - Process
    - Operator
    - Bit/gate level
  - Special-purpose execution units
- Need to develop matches between application development environments and programming models of ASIPs/programmable platforms
- Match must consider:
  - Efficiency
  - Productivity
  - Robustness
  - Reliability
