Monthly R&D Status Report

Date: December 15, 1993
Title: "SYSTEM-LEVEL DESIGN METHODOLOGY FOR EMBEDDED SIGNAL PROCESSORS"
Contract Number: F33615-93-C-1317
Principal Investigator: Edward A. Lee
Organization: University of California at Berkeley

1. Tasks Performed

This has been a very busy month, with quite a bit to report. Most of the activity has centered around preparation of the 0.5 release of Ptolemy. We are on schedule for this release; our target date for its completion is February of 1994. The critical path is still the conversion of the GUI to use Tcl/Tk. See section 2 for a detailed list of specific accomplishments.

The conversion of the documentation from troff to FrameMaker is continuing, with large sections of the User's manual and Programmer's manual completed. We have developed LaTeX templates so that those portions of the documentation that were previously written for on-line use with the hypertext system texinfo can be incorporated into the manual unchanged.

The Cygnus Corporation, of which we are a customer, responded within two days to our "severity: critical" bug report about release 2.5.3 of the GNU C++ compiler, which had a fatal bug in the istrstream destructor of the libg++ library that made it impossible to compile Ptolemy. We received a patch to the library, followed shortly thereafter by a new release of both the compiler and the library. Moreover, Cygnus has agreed to use Ptolemy in their regression tests for new compiler releases. This should make conversion to new versions of the compiler reasonably painless. We will probably be releasing Ptolemy with version 2.5.7 of g++ and version 2.5.2 of libg++. These were released only yesterday, so we will implement our "compiler freeze" this week.

Stefan De Troch of IMEC, in Belgium, with a little help from us, has adapted the Linker class to work under HP-UX.
There remains a problem with the DE domain, but it now looks very likely that our 0.5 release will support dynamic linking on Hewlett-Packard machines.

2. Significant Accomplishments

In regard to (3.2), "Specifying and Managing Heterogeneous Designs," we have finalized the plans for a new domain in Ptolemy that will be used to define "Target" objects in a modular way. The semantics of the domain will be similar to Make. It will provide a flexible way to build customized design-flow managers. However, we are postponing the implementation of this domain until after the 0.5 release, since we believe its design to be nontrivial, and we will need more time to test the concept.

We have installed a Silage domain in the main Ptolemy directory tree. This is a code generation domain that couples to Prof. Rabaey's high-level synthesis tool called Hyper.

In regard to (3.3), "Algorithm representation and design," the Matrix Particle class has been fully integrated into the Ptolemy kernel, and a set of approximately 40 matrix operators has been built into the star library. The set of demonstration applications includes a Kalman filter implementation and an implementation of the MUSIC algorithm for detection of sinusoids in noise.

As a short-term solution to the problem raised in our last report, where we found ourselves unable to use routines from the "Numerical Recipes in C" book, we have identified a number of sources of genuinely public domain software that we can incorporate. Since much of this code is written in Fortran, we have experimented with automatic conversion from Fortran to C, and have found the results satisfactory.

We have identified a longer-term solution that is much more attractive. An ARPA-funded project conducted jointly by the University of Tennessee and Oak Ridge National Laboratory is developing a C++ encapsulation of the LAPACK library.
We will be following this project closely, and will consider incorporation of the complete package in future releases of Ptolemy.

We have installed a new "communicating processes" (CP) domain in our official Ptolemy directory tree. This domain was developed by Seungjun Lee of the ARPA/INFOPAD project here at Berkeley. It has been used extensively for high-level modeling of a wireless multimedia network. Unfortunately, it has some problematic interaction with the SDF domain that we have not yet been able to solve, so its status with respect to the release is not clear right now. The domain is built on the Sun lightweight process library, and hence can only be supported on Sun platforms.

As a class project, two students in the CS division, Ed Knightly and Anindo Banerjea, have created a much faster version of the discrete-event scheduler in the DE domain. Although the behavior still needs some tuning, the performance is impressive. On simulations with large event lists, the new scheduler can be up to a factor of 1000 faster than the previous scheduler. On practical simulations, the overall speedup is less dramatic, but still significant. We have seen a speedup of up to a factor of 2.5, and have not found any circumstance under which the simulation is slower with the new scheduler. The new scheduler is based on the "calendar queue" mechanism developed by Randy Brown.

In regard to (3.5), "synthesis of embedded software and firmware," we have made significant extensions to the ptlang preprocessor that make specification of complicated conditional code generation stars much easier. This work is based on the design of Egbert Ammicht of Bell Labs. It permits free intermixing of code to be generated and code that controls the generation. For example, the following "go" method generates in-line additions for any number of inputs:

    go {
    @   $ref(output) =
        int ni = input.numberPorts();
        for (int i = 1; i <= ni; i++) {
    @   $ref(input#@i) @(i < ni ? " + " : ";")
        }
    }

When the "@" symbol is the first symbol in a line, that line contains code to be generated, possibly with macro substitution. In this case, the "@i" gets substituted in the generated code with the value of the index "i". The "$ref(input#@i)" is substituted with a reference to the C variable that contains the current token for the i-th input to the star. The expression "@(i < ni ? " + " : ";")" generates a " + " if i < ni, and otherwise terminates the statement with a semicolon. The generated code will thus look like (for a three-input add):

    x1 + x2 + x3;

This assumes that "xI" is the variable storing the I-th input.

The ptlang preprocessor has also been updated to support bidirectional portholes (ports that support input or output). We have also made a number of technical improvements to the base classes that support all code generation, notably in the generation and manipulation of unique symbols.

In regard to (3.6) of the SOW, "Interactive, direct-manipulation graphical user interfaces," we have finished systematizing the management of colors and fonts in order to ensure that our GUI operates properly on a variety of platforms, regardless of the colors and fonts that are installed. We are using the options database mechanism in Tk, which allows for (but does not require) customization through X resources.

We have upgraded to the newest versions of Tcl and Tk (7.1 and 3.4, respectively). We have reorganized our links to them so that they will be installed by default within the Ptolemy directory tree.

We have cleaned up and documented the mechanism for attaching Tcl/Tk scripts to stars. A chapter of the Programmer's manual will be devoted to this. A variety of stars have been written for displaying signals in various ways.

We have upgraded to the latest version of Vem (the schematic editor we use), rewritten the makefiles to correspond with the Ptolemy style, and installed it in the Ptolemy directory tree.
Thus, in the 0.5 release, unlike in previous releases, we will be including the source code for the schematic editor.

In regard to (3.4), (3.5), and (3.7), we have completed a paper for ICASSP '94 describing three types of interfaces between heterogeneous subsystems. The first of these assumes a mixture of hardware platforms, such as a workstation with a DSP card, and invokes a multiprocessor scheduler that supports heterogeneous partitioning. Code generation is done for each platform. The second mechanism relies on user-guided partitioning, followed by invocation of multiple schedulers, one for each subsystem. This allows for use of schedulers that are optimized for particular platforms. For example, a scheduler that optimizes for memory usage might be used for the DSP component, while a simpler scheduler could be used for the workstation component. The third mechanism interfaces a simulation running on the workstation with synthesized software running on another platform, such as a DSP card. All three mechanisms generalize the "wormhole" interface in Ptolemy. They have been tested on a configuration that consists of a SPARCstation 10 with two Ariel DSP56x cards.

In regard to (3.8), "Formal Methods," we have completed a paper for ICASSP '94 describing an optimal scheduling technique for chain-structured synchronous dataflow graphs that produces single-appearance schedules with minimal use of memory for buffering of data. The optimal algorithm is a dynamic programming algorithm with complexity that is order N^3. We have also developed an order N^2 heuristic that works well for a large number of randomly generated test cases. Both techniques have a sizable practical impact for programs that involve non-trivial sample-rate conversions. We have demonstrated this with a test program that converts compact disk recordings (at 44.1 kHz) to DAT recordings (at 48 kHz).

3. Problems Encountered

We have had problems with Purify, a program from Pure Inc.
that tracks memory allocations, structure overwrites, and memory leaks. The version that we have installed does not work with the newest compiler from the Free Software Foundation. We are working with engineers from Pure Inc. to solve this problem. As extra incentive for them, we have arranged to pay for a departmental site license for their software.

4. Schedule Reconciliation

We are on schedule.

5. Next Period Activities

The next period will be quite a bit quieter than this one (I hope), since most of the students will be gone for part of the Christmas break. The staff will concentrate on testing and bug fixes during this period. The major coding task remaining before the 0.5 release can be frozen is the update of the user interface for Target specification and editing of Target parameters.

The documentation will also be finalized during this time period. We expect four sizable volumes: a User's manual, a Programmer's manual, a Hacker's manual (the title is not decided, but the intent is to provide details to the power user), and a Star Atlas that will provide a comprehensive list of all the stars and demos in the system.

6. Budget Summary

We are on budget, as near as I can tell. Details will be provided by the University accounting office.

7. Conferences and Trips

We were visited by three RASSP participants from the Research Triangle Institute whose principal interest is in transformation of dataflow graphs. Together we developed a pretty clear conception of how we can work together on this. We also connected them with Prof. Jan Rabaey's group, which has done quite a bit of work in graph transformation.

8. Other Comments

We welcome feedback on the content and format of this report, as we would like these reports to be as useful as possible.
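
As an illustration of the conditional code generation described in section 2 under (3.5), the following C++ sketch mimics the text expansion that the "go" method performs for an N-input add. It is not ptlang and is not part of Ptolemy. The function name generateAddExpr and the variable names "y" and "x1" through "xN" are assumptions made for this example; they stand in for whatever $ref(output) and $ref(input#i) would actually resolve to. Line breaks in the generated text are ignored.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for the ptlang "go" method from section 2.
// Builds the text of the in-line addition for ni inputs.
// Assumption: $ref(output) resolves to "y" and $ref(input#i) to "x<i>".
std::string generateAddExpr(int ni) {
    std::ostringstream code;
    code << "y = ";                      // corresponds to: $ref(output) =
    for (int i = 1; i <= ni; i++) {
        code << "x" << i;                // corresponds to: $ref(input#@i)
        code << (i < ni ? " + " : ";");  // separator, or terminator on the last input
    }
    return code.str();
}
```

Under these assumptions, generateAddExpr(3) returns the text y = x1 + x2 + x3; whose right-hand side matches the three-input expression given in section 2.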