
1.7 Debugging Ptolemy and Extensions Within Pigi


The extensibility of Ptolemy can introduce problems. Code that you add may be defective (few people write perfect code every time), or may interact with Ptolemy in unexpected ways. These problems most frequently manifest themselves as a Ptolemy crash, where the Ptolemy kernel aborts, creating a core file.

The fact that pigiRpc and vem are separate Unix processes has the advantage that when pigiRpc aborts with a fatal error, vem keeps running. Your vem schematic is unharmed and can be safely saved. Vem gives a cryptic error message something like:

RPC Error: server: application exited without calling RPCExit
Closing Application /home/ohm1/users/messer/ptolemy/lib/pigiRpcShell on host foucault.berkeley.edu
Elapsed time is 1538 seconds
The message

segmentation fault (core dumped)
may appear in the window from which you started pigi. The first line of the vem message above might alternatively read

RPC Error: fread of long failed
Vem is trying to tell you that it is unable to get data from the link to the Ptolemy kernel. In either case, a large file called core will be created in your home directory. The core file (see footnote 1) is useful for finding the problem.
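
If no core file appears after a crash, the shell's core size limit may be set to zero. The following is a minimal sketch of a check, assuming a Bourne-style shell (bash, ksh or dash) in which ulimit -c prints the current limit; under csh the corresponding command is limit coredumpsize. The helper name core_limit_allows_dump is hypothetical and is not part of Ptolemy.

```shell
# Hypothetical helper: succeeds when a core size limit permits core files.
# A limit of 0 disables them; any other value (including "unlimited") allows them.
core_limit_allows_dump() {
    [ "$1" != "0" ]
}

# Assumption: `ulimit -c` prints the current core size limit (bash, ksh, dash).
if core_limit_allows_dump "$(ulimit -c)"; then
    echo "core files are enabled"
else
    echo "core files are disabled; 'ulimit -c unlimited' may re-enable them for this shell"
fi
```

Raising the limit only affects processes started from that shell afterwards, so pigi would have to be restarted from the same window.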

1.7.1 A quick scan of the stack

Assuming you are using GNU tools, and assuming the pigiRpc executable that you are using is in your path, go to your home directory and type:

gdb pigiRpc
The GNU symbolic debugger (gdb) will show the state of the stack at the point where the program failed. Note that gdb is not distributed with Ptolemy, but is available free over the Internet in many places, including ftp://prep.ai.mit.edu/pub/gnu. The most recently called function might give you a clue about the cause of the problem. Here is a typical session:

cxh@watson 197% gdb pigiRpc ~/core
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for
details.
GDB 4.15.1 (sparc-sun-solaris2.4),
Copyright 1995 Free Software Foundation, Inc...
(no debugging symbols found)...
Tell gdb to read in the core file.

(gdb) core core
Core was generated by `/users/ptolemy/bin.sol2/pigiRpc :0.0 watson.eecs.berkeley.edu 32870 inet 1 2 3'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from
/users/ptolemy/lib.sol2/libcg56dspstars.so...done.
Reading symbols from
/users/ptolemy/lib.sol2/libcg56stars.so...done.
Since this version of Ptolemy uses shared libraries, we see lots of messages about shared libraries, which we've deleted here for brevity.

(gdb) where
#0 0xee7a1c20 in _kill ()
#1 0x52b04 in pthread_clear_sighandler ()
#2 0x52cb4 in pthread_clear_sighandler ()
#3 0x53130 in pthread_clear_sighandler ()
#4 0x53320 in pthread_handle_one_process_signal ()
#5 0x55658 in pthread_signal_sched ()
#6 0x554d8 in called_from_sighandler ()
#7 0x535e4 in pthread_handle_pending_signals ()
#8 0x10100c in SimControl::getPollFlag ()
#9 0x101604 in Star::run ()
#10 0xd394c in DataFlowStar::run ()
#11 0xeeca5fb8 in SDFAtomCluster::run (this=0x2bd0b0)
at ../../../../src/domains/sdf/kernel/SDFCluster.cc:1032
#12 0xeeca0f20 in SDFScheduler::runOnce (this=0x2bd050)
at ../../../../src/domains/sdf/kernel/SDFScheduler.cc:121
#13 0xeeca0eac in SDFScheduler::run (this=0x2bd050)
at ../../../../src/domains/sdf/kernel/SDFScheduler.cc:98
#14 0x108358 in Target::run ()
#15 0x109e04 in Runnable::run ()
#16 0xe62ec in InterpUniverse::run ()
#17 0xee9e7f04 in PTcl::run (this=0x20af80, argc=2949528, argv=0x109fa4)
at ../../src/ptcl/PTcl.cc:521
#18 0xee9e99a4 in PTcl::dispatcher (which=0x27, interp=0x1d4830, argc=2,
The "where" command shows the state of the stack at the time of the crash. The actual stack trace was 72 frames long, the last two frames being:

#71 0xeec06d5c in ptkMainLoop ()
at ../../src/pigilib/ptkTkSetup.c:192
#72 0x4982c in main ()
Scanning this list we can recognize that the crash occurred during the execution of a star. Unfortunately, unless you are running a version of pigiRpc with the debug symbols loaded, it will be difficult to tell much more from this.
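
The quick scan above can also be done without an interactive session. The sketch below assumes a gdb recent enough to accept the -batch and -ex options (the GDB 4.15.1 shown in the transcript predates them) and a Bourne-style shell rather than csh. The function name core_backtrace is hypothetical.

```shell
# Hypothetical helper: print the stack trace stored in a core file and exit.
# Assumption: the installed gdb supports the -batch and -ex options.
core_backtrace() {
    exe=$1     # executable that produced the core file, e.g. pigiRpc
    core=$2    # path to the core file, e.g. $HOME/core
    if [ ! -r "$core" ]; then
        echo "no readable core file at $core" >&2
        return 1
    fi
    # -batch exits after the commands have run; -ex where prints the stack.
    gdb -batch -ex where "$exe" "$core"
}
```

A call such as core_backtrace pigiRpc "$HOME/core" would then print the same frames as the "where" command in the session above.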

1.7.2 More extensive debugging

To do more extensive debugging, you need to create or find a version of pigiRpc with debug symbols, called pigiRpc.debug.

The first step is to build a pigiRpc that contains the domains you are interested in debugging. There are several ways to build a pigiRpc:

a. There may be prebuilt debug binaries on the Ptolemy Web site; check the directory that contains the latest release.
b. Rebuild the entire tree from scratch. This takes about 3 hours. Appendix A in the Ptolemy User's Manual has instructions about this.
c. Use mkPtolemyTree to rebuild a subset of the Ptolemy tree. See "Using mkPtolemyTree to create a custom Ptolemy trees" on page 1-9 for more information.
d. Use the csh aliases to rebuild a subset of the Ptolemy tree. See "Using csh aliases to create a Parallel Software Development Tree" on page 1-12 for more information.

The next step is to build the pigiRpc.debug binary:

cd $PTOLEMY/obj.$PTARCH/pigiRpc; make pigiRpc.debug
Then set the PIGIRPC environment variable to point to the binary (see footnote 2):

setenv PIGIRPC $PTOLEMY/obj.$PTARCH/pigiRpc/pigiRpc.debug
Then run pigi as follows:

pigi -debug
An extra window running gdb appears. (If this fails, then gdb is probably not installed at your site or is not in your path.) Type cont to continue past the initial breakpoint.
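
Before starting a debug session it can save time to confirm that the tools are reachable. This is a minimal sketch, assuming a Bourne-style shell (csh users can get the same information from the which command). The helper name have_tool is hypothetical.

```shell
# Hypothetical helper: succeeds if the named program can be found via $PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report on the two programs that a pigi debug session needs.
for tool in gdb pigi; do
    if have_tool "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found in PATH"
    fi
done
```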

Now, if you can replicate the situation that created the crash, you will be able to get more information about what happened. Here is a sample of interaction with the debugger through the gdb window:

GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for
details.
GDB 4.15.1 (sparc-sun-solaris2.4),
Copyright 1995 Free Software Foundation, Inc...
Breakpoint 1 at 0x39ab4: file ../../src/pigiExample/pigiMain.cc, line 58.
Breakpoint 1, main (argc=-282850408, argv=0x399c0)
at ../../src/pigiExample/pigiMain.cc:58
58 pigiFilename = argv[0];
(gdb) cont
Continuing.
At this point, you are running Ptolemy. Use it in the usual way to replicate your problem. When you succeed, you will get a message something like:

Program received signal SIGSEGV, Segmentation fault.
0xeee81394 in mxRealMax ()
(gdb)
At this point you can again examine the stack. This time, however, there will be more information. Here, we examine the top 5 frames of the stack:

(gdb) where 5
#0 0xeee81394 in mxRealMax ()
#1 0xe3864 in SimControl::getPollFlag () at ../../src/kernel/SimControl.cc:271
#2 0xe3e5c in Star::run (this=0x28c908) at ../../src/kernel/Star.cc:73
#3 0xbacb8 in DataFlowStar::run (this=0x28c908)
at ../../src/kernel/DataFlowStar.cc:94
#4 0xef485fb8 in SDFAtomCluster::run (this=0x278570)
at ../../../../src/domains/sdf/kernel/SDFCluster.cc:1032
(More stack frames follow...)
(gdb)
This particular stack trace is a little strange at the "bottom" (gdb calls the lower numbers the bottom even though they are at the top of the list) because it was generated by invoking a dynamically linked star, and the symbol information is not complete. However, you can still find out quite a bit. Notice that you are now told where the files are that define the methods being called. The file names are all relative to the directory in which the corresponding object file normally resides. The Ptolemy files can all be found in some subdirectory of $PTOLEMY/src.

You can get help from gdb by typing "help". Suppose you wish to find out first which star is being run when the crash occurs. The following sequence moves up in the stack until the "run" call of a star:

(gdb) up
#1 0xe3864 in SimControl::getPollFlag () at ../../src/kernel/SimControl.cc:271
271 ptBlockSig(SIGALRM);
(gdb) up
#2 0xe3e5c in Star::run (this=0x28c908) at ../../src/kernel/Star.cc:73
73 go();
(gdb)
At this point, you can see that line 73 of the file $PTOLEMY/src/kernel/Star.cc reads


go();
Odds are pretty good that the problem is in the go() method of the star. You can find out to which star this method belongs as follows:

(gdb) p *this
$1 = {<Block> = {<NamedObj> = {nm = 0x28ad58 "BadStar1",
prnt = 0x28c878,
myDescriptor = 0x28b658 "Causes a core dump deliberately",
_vptr. = 0xeee91738}, flags = {nElements = 0, val = 0x0},
pTarget = 0x28aa60, scp = 0x0,
ports = {<NamedObjList> = {<SequentialList> =
{lastNode = 0x0, dimen = 0}, }, }, states = {<NamedObjList> =
{<SequentialList> = { lastNode = 0x0, dimen = 0}, }, },
multiports = {<NamedObjList> = {<SequentialList> =
{lastNode = 0x0, dimen = 0}, }, }},
indexValue = -1, inStateFlag = 1}
(gdb)
This tells you that a star with name (nm) BadStar1 and descriptor "Causes a core dump deliberately" is being invoked. This particular star has the following erroneous go method:

go {
char* p = 0;
*p = 'c';
}
More elaborate debugging requires that the symbols for the star be included. The easiest way to do this is to build a version of pigiRpc.debug that includes your star already linked into the system. Then repeat the above procedure. The bottom of the stack frame will have much more complete information about what is occurring.

1.7.3 Debugging hints

Below are some hints for debugging.

Using emacs, gdb and pigi

By default, gdb is started in an X terminal window with its default command line interface. Many people prefer to interface with gdb through emacs, which provides much more sophisticated interaction between the source code and the debugger. To get an emacs interface to gdb (assuming emacs is installed on your system), set the following environment variable:

setenv PT_DEBUG ptgdb
To find out more about using gdb from within emacs, start up emacs and type:

M-x info
Then type:
m emacs
Then go down to:


Running Debuggers Under Emacs

* Starting GUD:: How to start a debugger subprocess.
* Debugger Operation:: Connection between the debugger and source buffers.
* Commands of GUD:: Key bindings for common commands.
* GUD Customization:: Defining your own commands for GUD.

Gdb and the environment

Note that the documentation for gdb says the following:

*Warning:* GDB runs your program using the shell indicated by your `SHELL' environment variable if it exists (or `/bin/sh' if not). If your `SHELL' variable names a shell that runs an initialization file--such as `.cshrc' for C-shell, or `.bashrc' for BASH--any variables you set in that file affect your program. You may wish to move setting of environment variables to files that are only run when you sign on, such as `.login' or `.profile'.

Optimization

By default, Ptolemy is compiled with the optimizer turned up to a very high level. This can result in strange behavior inside the debugger, as the compiler may evaluate instructions in a different order than they appear in the source file. You may find it easier to debug a file by recompiling it with the optimization turned off by removing the corresponding .o file and doing:

make OPTIMIZER= install
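
When several files need to be recompiled without optimization, the two steps can be wrapped in a small function. This is only a sketch, assuming a Bourne-style shell, that it is run from the object directory, and that each object file shares the base name of its source file. The names rebuild_unoptimized and MAKE are hypothetical conveniences, not part of the Ptolemy build system.

```shell
# Hypothetical helper: remove the object files for the given sources, then
# rebuild with the optimizer flag emptied, as in the command above.
# Assumption: object files sit in the current directory and share the
# base name of their source file.
rebuild_unoptimized() {
    for src in "$@"; do
        base=${src##*/}          # strip any leading directory
        rm -f "${base%.*}.o"     # SDFFoo.cc -> SDFFoo.o
    done
    # MAKE may be overridden (for example with gmake); it defaults to make.
    ${MAKE:-make} OPTIMIZER= install
}
```

Only the named object files are removed, so the rest of the directory stays optimized and the rebuild is short.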

Debugging StringLists in gdb

Ptolemy uses StringList objects to manipulate strings. However, using gdb to view a StringList object can be non-intuitive. To print the contents of a StringList myStringList as one item per line from within gdb, use:

p displayStringListItems(myStringList)
To print out the StringList as a contiguous string, use:

p displayStringList(myStringList)

How to use ptcl to speed up the compile/test cycle.

If you are spending a lot of time debugging a problem, you may want to use ptcl instead of pigiRpc, as ptcl is smaller and starts up faster. Also, you can keep your breakpoints between invocations of ptcl, as debugging ptcl does not start up a separate emacs each time. However, ptcl cannot handle demos that use Tk.

Here's how to use ptcl to debug.

1. Run pigiRpc on the universe, and use compile-facet to generate a ~/pigiLog.pt file. Note the number of iterations for the universe, and then exit pigiRpc.
2. Copy ~/pigiLog.pt to somewhere. A short file name, like /tmp/tst.tcl will save time in typing since you may be typing it often. Don't use something inside your home directory as you can't easily use ~ inside ptcl.
3. Edit the file and add a run XXX line and a wrapup line at the end. If the demo should run for 100 iterations, then add:
run 100
wrapup
to the end of the file.
4. Build a ptcl.debug that has just exactly the functionality you need by using an override.mk file. Alternatively, you could use either ptcl.ptrim.debug or ptcl.ptiny.debug. If your demo is SDF, then try building and using ptcl.ptiny.debug.
5. If you use emacs, then you can start up gdb on your binary with:
M-x gdb
6. Then type in the name of the binary. You may have to use the full pathname.
7. Inside emacs, you can then set breakpoints in the gdb window, either by typing a break command, or by viewing the file and typing Control-X space at the location where you would like a breakpoint.
8. Type r to start the process, and then source your demo with:
source /tmp/tst.tcl
If you want to recompile your demo outside of gdb and then reload it into your gdb session, use the file command inside gdb:
file /users/cxh/pt/obj.sol2/ptcl/ptcl.ptiny.debug
Your breakpoints will be saved, which is a big time saver.
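
Steps 2 and 3 above can be scripted so that the test file is easy to regenerate after each compile-facet. This is a minimal sketch, assuming a Bourne-style shell; the function name make_ptcl_script is hypothetical, and the paths in the usage note are the ones suggested in the steps.

```shell
# Hypothetical helper: copy a pigi log to a short path and append the
# run and wrapup commands that ptcl needs to execute the universe.
make_ptcl_script() {
    log=$1          # e.g. $HOME/pigiLog.pt
    out=$2          # e.g. /tmp/tst.tcl
    iterations=$3   # number of iterations noted in pigi
    cp "$log" "$out" || return 1
    # The leading newline guards against a log whose last line is unterminated.
    printf '\nrun %s\nwrapup\n' "$iterations" >> "$out"
}
```

For the 100-iteration example, the call would be make_ptcl_script "$HOME/pigiLog.pt" /tmp/tst.tcl 100, after which /tmp/tst.tcl can be sourced from ptcl as in step 8.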

Miscellaneous debugging hints for gdb

If you are having problems debugging with gdb, here's what to check.

1. Verify that your $PTOLEMY is set to what you intended. If you are building binaries in your private tree, be sure that $PTOLEMY is set to your private tree and not ~ptdesign or /users/ptolemy.
2. Verify that your $LD_LIBRARY_PATH does not include libraries in another Ptolemy tree. You could type:
unsetenv LD_LIBRARY_PATH
3. gdb sources your .cshrc, so your $PTOLEMY and $LD_LIBRARY_PATH could be different. Inside gdb, use
show env PTOLEMY
to see what it is set to. This problem is especially common if you are running gdb inside emacs via ptgdb.
4. Verify that you are running the right binary by looking at the creation times. You may find it useful to use the -rpc option:
pigi -debug -rpc $PTOLEMY/obj.$PTARCH/pigiRpc/pigiRpc.mine ~ptdesign/init.pal
5. Recompile the problem files with optimization turned off and relink your pigiRpc. You can do this with
rm myfile.o; make OPTIMIZER= install
Then rebuild your pigiRpc.
6. Look for weird coding styles that could confuse the line count in emacs and gdb, such as declaring variables in the middle of a block and brackets that open a function body on the same line as the function declaration:
int foo(int bar){
vs.
int foo(int bar)
{
7. Use stepi to step by instructions, rather than step.
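
The check in hint 2 can be partly automated by listing every library path entry that lies outside the tree you intend to use. This is a sketch under stated assumptions: a Bourne-style shell, path entries without wildcard characters, and a hypothetical function name list_outside_tree. Entries it prints are not necessarily wrong (system library directories are normal); they are simply the ones to inspect.

```shell
# Hypothetical helper: print each entry of a colon-separated path list
# that is not inside the given root directory.
list_outside_tree() {
    root=$1
    path_list=$2
    old_ifs=$IFS
    IFS=:                     # split the list on colons
    for dir in $path_list; do
        case $dir in
            "") ;;                       # ignore empty entries
            "$root"|"$root"/*) ;;        # inside the intended tree
            *) echo "$dir" ;;            # outside it: worth a look
        esac
    done
    IFS=$old_ifs
}
```

A typical call would be list_outside_tree "$PTOLEMY" "$LD_LIBRARY_PATH"; any line naming a lib directory of another Ptolemy tree points to the problem described in hint 2.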



1 Note that core files can be large, so your system administrator may have set up the csh "limit" command to disable the creation of core files. For further information, see the csh man page.

2 Note that the pigi script will attempt to find a pigiRpc.debug binary if the PIGIRPC environment variable is not set. Alternatively, you can avoid setting PIGIRPC and use the pigi -rpc option to specify a binary. The command would be:
pigi -debug -rpc $PTOLEMY/obj.$PTARCH/pigiRpc/pigiRpc.debug

Copyright © 1990-1997, University of California. All rights reserved.