The extensibility of Ptolemy can introduce problems. Code that you add may be defective (few people write perfect code every time), or may interact with Ptolemy in unexpected ways. These problems most frequently manifest themselves as a Ptolemy crash, where the Ptolemy kernel aborts, creating a core file.
Because pigiRpc and vem are separate Unix processes, vem keeps running when pigiRpc aborts with a fatal error. Your vem schematic is unharmed and can be safely saved. Vem gives a cryptic error message something like:
RPC Error: server: application exited without calling RPCExit
Closing Application /home/ohm1/users/messer/ptolemy/lib/pigiRpcShell on host foucault.berkeley.edu
Elapsed time is 1538 seconds
The message
segmentation fault (core dumped)
may appear in the window from which you started pigi. The first line of the vem message might alternatively read
RPC Error: fread of long failed
Vem is trying to tell you that it is unable to get data from the link to the Ptolemy kernel. In either case, the crash creates a large file called core in your home directory. The core file (see footnote 1) is useful for finding the problem.
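If no core file appears, the core-size limit mentioned in footnote 1 may be set to zero. The following is a minimal sketch of a check, written in Bourne shell (sh) syntax rather than csh. The function name core_limit_allows_dump is hypothetical and not part of Ptolemy. It only interprets a limit value, such as the one printed by "ulimit -c" in most sh-compatible shells.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ptolemy): decide whether a core-size
# limit permits core files. The argument is a limit value such as the one
# printed by `ulimit -c`: either "unlimited" or a non-negative number.
# Returns 0 if cores can be written, 1 if the limit is zero,
# and 2 if the value cannot be interpreted.
core_limit_allows_dump() {
    case "$1" in
        unlimited) return 0 ;;
        ''|*[!0-9]*) return 2 ;;      # empty or not a number: cannot tell
        *) [ "$1" -gt 0 ] ;;          # any positive limit allows a core
    esac
}
```

A possible use is core_limit_allows_dump "$(ulimit -c)" || echo "core files are disabled". In csh, the "limit" command described in the csh man page plays the same role.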
Assuming you are using Gnu tools, and assuming the pigiRpc executable that you are using is in your path, go to your home directory and type:
gdb pigiRpc
The Gnu symbolic debugger (gdb) can show the state of the stack at the point where the program failed. Note that gdb is not distributed with Ptolemy, but is available free over the Internet in many places, including ftp://prep.ai.mit.edu/pub/gnu. The most recently called function might give you a clue about the cause of the problem. Here is a typical session:
cxh@watson 197% gdb pigiRpc ~/core
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for
details.
GDB 4.15.1 (sparc-sun-solaris2.4),
Copyright 1995 Free Software Foundation, Inc...
(no debugging symbols found)...
Tell gdb to read in the core file:
(gdb) core core
Core was generated by `/users/ptolemy/bin.sol2/pigiRpc :0.0 watson.eecs.berkeley.edu 32870 inet 1 2 3'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /users/ptolemy/lib.sol2/libcg56dspstars.so...done.
Reading symbols from /users/ptolemy/lib.sol2/libcg56stars.so...done.
Since this version of Ptolemy uses shared libraries, we see lots of messages about shared libraries, which we've deleted here for brevity.
(gdb) where
#0 0xee7a1c20 in _kill ()
#1 0x52b04 in pthread_clear_sighandler ()
#2 0x52cb4 in pthread_clear_sighandler ()
#3 0x53130 in pthread_clear_sighandler ()
#4 0x53320 in pthread_handle_one_process_signal ()
#5 0x55658 in pthread_signal_sched ()
#6 0x554d8 in called_from_sighandler ()
#7 0x535e4 in pthread_handle_pending_signals ()
#8 0x10100c in SimControl::getPollFlag ()
#9 0x101604 in Star::run ()
#10 0xd394c in DataFlowStar::run ()
#11 0xeeca5fb8 in SDFAtomCluster::run (this=0x2bd0b0)
at ../../../../src/domains/sdf/kernel/SDFCluster.cc:1032
#12 0xeeca0f20 in SDFScheduler::runOnce (this=0x2bd050)
at ../../../../src/domains/sdf/kernel/SDFScheduler.cc:121
#13 0xeeca0eac in SDFScheduler::run (this=0x2bd050)
at ../../../../src/domains/sdf/kernel/SDFScheduler.cc:98
#14 0x108358 in Target::run ()
#15 0x109e04 in Runnable::run ()
#16 0xe62ec in InterpUniverse::run ()
#17 0xee9e7f04 in PTcl::run (this=0x20af80, argc=2949528, argv=0x109fa4)
at ../../src/ptcl/PTcl.cc:521
#18 0xee9e99a4 in PTcl::dispatcher (which=0x27, interp=0x1d4830, argc=2,
The "where" command shows the state of the stack at the time of the crash. The actual stack trace was 72 frames long, the last two frames being:
#71 0xeec06d5c in ptkMainLoop ()
at ../../src/pigilib/ptkTkSetup.c:192
#72 0x4982c in main ()
Scanning this list, we can recognize that the crash occurred during the execution of a star. Unfortunately, unless you are running a version of pigiRpc with the debug symbols loaded, it will be difficult to tell much more from this.
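The post-mortem session above can also be run without an interactive prompt. The following is a minimal sketch in Bourne shell (sh) syntax rather than csh. The function name core_backtrace is hypothetical. It assumes a gdb recent enough to accept the -batch and -ex options, which the gdb 4.15.1 shown in the session may not support.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ptolemy): print a stack trace from a
# core file without an interactive gdb session.
# Assumes a gdb that accepts the -batch and -ex options.
# Arguments: executable name, optional core file (defaults to $HOME/core).
core_backtrace() {
    exe=$1
    core=${2:-$HOME/core}
    if [ ! -r "$core" ]; then
        echo "core_backtrace: cannot read core file: $core" >&2
        return 1
    fi
    # "where" is the same command typed at the (gdb) prompt in the session.
    gdb -batch -ex where "$exe" "$core"
}
```

For example, core_backtrace pigiRpc would print the stack for a core file in your home directory, which can then be saved to a file or mailed with a bug report.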
To do more extensive debugging, you need to create or find a version of pigiRpc with debug symbols, called pigiRpc.debug.
The first step is to build a pigiRpc that contains the domains you are interested in debugging. There are several ways to build a pigiRpc:
- a. There may be prebuilt debug binaries on the Ptolemy Web site; check the directory that contains the latest release.
- b. Rebuild the entire tree from scratch. This takes about 3 hours. Appendix A in the Ptolemy User's Manual has instructions about this.
- c. Use mkPtolemyTree to rebuild a subset of the Ptolemy tree. See "Using mkPtolemyTree to create a custom Ptolemy trees" on page 1-9 for more information.
- d. Use the csh aliases to rebuild a subset of the Ptolemy tree. See "Using csh aliases to create a Parallel Software Development Tree" on page 1-12 for more information.
The next step is to build the pigiRpc.debug binary:
cd $PTOLEMY/obj.$PTARCH/pigiRpc; make pigiRpc.debug
Then set the PIGIRPC environment variable to point to the binary (see footnote 2):
setenv PIGIRPC $PTOLEMY/obj.$PTARCH/pigiRpc/pigiRpc.debug
Then run pigi as follows:
pigi -debug
An extra window running gdb appears. (If this fails, then gdb is probably not installed at your site or is not in your path.) Type cont to continue past the initial breakpoint.
Now, if you can replicate the situation that created the crash, you will be able to get more information about what happened. Here is a sample of interaction with the debugger through the gdb window:
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for
details.
GDB 4.15.1 (sparc-sun-solaris2.4),
Copyright 1995 Free Software Foundation, Inc...
Breakpoint 1 at 0x39ab4: file ../../src/pigiExample/pigiMain.cc, line 58.
Breakpoint 1, main (argc=-282850408, argv=0x399c0)
at ../../src/pigiExample/pigiMain.cc:58
58 pigiFilename = argv[0];
(gdb) cont
Continuing.
At this point, you are running Ptolemy. Use it in the usual way to replicate your problem. When you succeed, you will get a message something like:
Program received signal SIGSEGV, Segmentation fault.
0xeee81394 in mxRealMax ()
(gdb)
At this point you can again examine the stack. This time, however, there will be more information. Here, we examine the top 5 frames of the stack:
(gdb) where 5
#0 0xeee81394 in mxRealMax ()
#1 0xe3864 in SimControl::getPollFlag () at ../../src/kernel/SimControl.cc:271
#2 0xe3e5c in Star::run (this=0x28c908) at ../../src/kernel/Star.cc:73
#3 0xbacb8 in DataFlowStar::run (this=0x28c908)
at ../../src/kernel/DataFlowStar.cc:94
#4 0xef485fb8 in SDFAtomCluster::run (this=0x278570)
at ../../../../src/domains/sdf/kernel/SDFCluster.cc:1032
(More stack frames follow...)
(gdb)
This particular stack trace is a little strange at the "bottom" (gdb calls the lower numbers the bottom even though they are at the top of the list) because it was generated by invoking a dynamically linked star, and the symbol information is not complete. However, you can still find out quite a bit. Notice that you are now told where the files are that define the methods being called. The file names are all relative to the directory in which the corresponding object file normally resides. The Ptolemy files can all be found in some subdirectory of $PTOLEMY/src.
You can get help from gdb by typing "help". Suppose you wish to find out first which star is being run when the crash occurs. The following sequence moves up in the stack until it reaches the "run" call of a star:
(gdb) up
#1 0xe3864 in SimControl::getPollFlag () at ../../src/kernel/SimControl.cc:271
271 ptBlockSig(SIGALRM);
(gdb) up
#2 0xe3e5c in Star::run (this=0x28c908) at ../../src/kernel/Star.cc:73
73 go();
(gdb)
At this point, you can see that line 73 of the file $PTOLEMY/src/kernel/Star.cc reads
go();
Odds are pretty good that the problem is in the go() method of the star. You can find out to which star this method belongs as follows:
(gdb) p *this
$1 = {<Block> = {<NamedObj> = {nm = 0x28ad58 "BadStar1",
prnt = 0x28c878,
myDescriptor = 0x28b658 "Causes a core dump deliberately",
_vptr. = 0xeee91738}, flags = {nElements = 0, val = 0x0},
pTarget = 0x28aa60, scp = 0x0,
ports = {<NamedObjList> = {<SequentialList> =
{lastNode = 0x0, dimen = 0}, }, }, states = {<NamedObjList> =
{<SequentialList> = { lastNode = 0x0, dimen = 0}, }, },
multiports = {<NamedObjList> = {<SequentialList> =
{lastNode = 0x0, dimen = 0}, }, }},
indexValue = -1, inStateFlag = 1}
(gdb)
This tells you that a star with name (nm) BadStar1 and descriptor "Causes a core dump deliberately" is being invoked. This particular star has the following erroneous go method:
go {
    char* p = 0;    // p is a null pointer
    *p = 'c';       // writing through it causes the segmentation fault
}
More elaborate debugging requires that the symbols for the star be included. The easiest way to do this is to build a version of pigiRpc.debug that includes your star already linked into the system. Then repeat the above procedure. The bottom of the stack frame will have much more complete information about what is occurring.
Below are some hints for debugging.
By default, gdb is started in an X terminal window with its default command line interface. Many people prefer to interface with gdb through emacs, which provides much more sophisticated interaction between the source code and the debugger. To get an emacs interface to gdb (assuming emacs is installed on your system), set the following environment variable:
setenv PT_DEBUG ptgdb
To find out more about using gdb from within emacs, start up emacs and type:
M-x info
Then type:
m emacs
Then go down to:
Running Debuggers Under Emacs
* Starting GUD:: How to start a debugger subprocess.
* Debugger Operation:: Connection between the debugger and source buffers.
* Commands of GUD:: Key bindings for common commands.
* GUD Customization:: Defining your own commands for GUD.
Note that the documentation for gdb says the following:
*Warning:* GDB runs your program using the shell indicated by your `SHELL' environment variable if it exists (or `/bin/sh' if not). If your `SHELL' variable names a shell that runs an initialization file--such as `.cshrc' for C-shell, or `.bashrc' for BASH--any variables you set in that file affect your program. You may wish to move setting of environment variables to files that are only run when you sign on, such as `.login' or `.profile'.
By default, Ptolemy is compiled with the optimizer turned up to a very high level. This can result in strange behavior inside the debugger, as the compiler may evaluate instructions in a different order than they appear in the source file. You may find it easier to debug a file by recompiling it with optimization turned off. To do so, remove the corresponding .o file and run:
make OPTIMIZER= install
Ptolemy uses StringList objects to manipulate strings. However, using gdb to view a StringList object can be non-intuitive. To print the contents of a StringList myStringList as one item per line from within gdb, use:
p displayStringListItems(myStringList)
To print out the StringList as a contiguous string, use:
p displayStringList(myStringList)
If you are spending a lot of time debugging a problem, you may want to use ptcl instead of pigiRpc, as ptcl is smaller and starts up faster. Also, you can keep your breakpoints between invocations of ptcl, as debugging ptcl does not start up a separate emacs each time. However, ptcl cannot handle demos that use Tk.
Here's how to use ptcl to debug.
- 1. Run pigiRpc on the universe, and use compile-facet to generate a ~/pigiLog.pt file. Note the number of iterations for the universe, and then exit pigiRpc.
- 2. Copy ~/pigiLog.pt to another location. A short file name, like /tmp/tst.tcl, will save time in typing since you may be typing it often. Don't use something inside your home directory, as you can't easily use ~ inside ptcl.
- 3. Edit the file and add a run XXX line and a wrapup line at the end, where XXX is the number of iterations. If the demo should run for 100 iterations, then add the following to the end of the file:
run 100
wrapup
- 4. Build a ptcl.debug that has just exactly the functionality you need by using an override.mk file. Alternatively, you could use either ptcl.ptrim.debug or ptcl.ptiny.debug. If your demo is SDF, then try building and using ptcl.ptiny.debug.
- 5. If you use emacs, then you can start up gdb on your binary with:
M-x gdb
- 6. Then type in the name of the binary. You may have to use the full pathname.
- 7. Inside emacs, you can then set breakpoints in the gdb window, either by typing a break command, or by viewing the file and typing Control-X space at the location where you would like a breakpoint.
- 8. Type r to start the process, and then source your demo with:
source /tmp/tst.tcl
- If you want to recompile your demo outside of gdb and then reload it into your gdb session, use the file command inside gdb:
file /users/cxh/pt/obj.sol2/ptcl/ptcl.ptiny.debug
- Your breakpoints will be saved, which is a big time saver.
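Steps 2 and 3 above can be sketched as a small script. This is only an illustration in Bourne shell (sh) syntax rather than csh, and the function name make_ptcl_script is hypothetical. It copies the log file to a short path and appends the run and wrapup lines.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ptolemy): copy a pigiLog.pt file to a
# short path and append the "run N" and "wrapup" lines from step 3.
# Arguments: source file, destination file, iteration count.
make_ptcl_script() {
    src=$1           # for example $HOME/pigiLog.pt
    dest=$2          # for example /tmp/tst.tcl
    iterations=$3    # the iteration count noted in step 1
    case "$iterations" in
        ''|*[!0-9]*)
            echo "make_ptcl_script: bad iteration count: $iterations" >&2
            return 1 ;;
    esac
    cp "$src" "$dest" || return 1
    # The leading newline guards against a log file whose last line
    # does not end with a newline.
    printf '\nrun %s\nwrapup\n' "$iterations" >> "$dest"
}
```

For a demo that runs for 100 iterations, make_ptcl_script "$HOME/pigiLog.pt" /tmp/tst.tcl 100 would produce a file ready for the source command in step 8.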
If you are having problems debugging with gdb, here's what to check.
- 1. Verify that your $PTOLEMY is set to what you intended. If you are building binaries in your private tree, be sure that $PTOLEMY is set to your private tree and not ~ptdesign or /users/ptolemy.
- 2. Verify that your $LD_LIBRARY_PATH does not include libraries in another Ptolemy tree. You could type:
unsetenv LD_LIBRARY_PATH
- 3. gdb starts your program through your shell, which sources your .cshrc, so the $PTOLEMY and $LD_LIBRARY_PATH seen by the program could be different from what you expect. Inside gdb, use
show env PTOLEMY
- to see what it is set to. This problem is especially common if you are running gdb inside emacs via ptgdb.
- 4. Verify that you are running the right binary by looking at the creation times. You may find it useful to use the -rpc option:
pigi -debug -rpc $PTOLEMY/obj.$PTARCH/pigiRpc/pigiRpc.mine ~ptdesign/init.pal
- 5. Recompile the problem files with optimization turned off and relink your pigiRpc. You can do this with
rm myfile.o; make OPTIMIZER= install
- Then rebuild your pigiRpc.
- 6. Look for unusual coding styles that could confuse the line count in emacs and gdb, such as declaring variables in the middle of a block, or braces that open a function body on the same line as the function declaration:
int foo(int bar){
- vs.
int foo(int bar)
{
- 7. Use stepi to step by instructions, rather than step.
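The library path check in item 2 can be sketched as follows. This is a minimal illustration in Bourne shell (sh) syntax rather than csh, and the function name foreign_lib_dirs is hypothetical. It lists every entry of a colon-separated path that lies outside a given tree, so that you can inspect the result for directories belonging to another Ptolemy tree. System directories such as /usr/lib will also be listed, since the sketch cannot tell them apart.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ptolemy): list the entries of a
# colon-separated library path that lie outside a given Ptolemy tree.
# Arguments: tree root, path string (for example "$LD_LIBRARY_PATH").
# The body runs in a subshell so that IFS and globbing changes do not leak.
foreign_lib_dirs() (
    root=${1%/}                 # drop one trailing slash, if any
    IFS=:
    set -f                      # no filename expansion while splitting
    for dir in $2; do
        case "$dir" in
            '') ;;                      # skip empty entries
            "$root"|"$root"/*) ;;       # inside the intended tree
            *) printf '%s\n' "$dir" ;;  # outside: report it
        esac
    done
)
```

A possible use is foreign_lib_dirs "$PTOLEMY" "$LD_LIBRARY_PATH", run in the same environment from which you start pigi.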
1. Note that core files can be large, so your system administrator may have set up the csh "limit" command to disable the creation of core files. For further information, see the csh man page.
2. Note that the pigi script will attempt to find a pigiRpc.debug binary if the PIGIRPC environment variable is not set. Alternatively, you can avoid setting PIGIRPC and use the pigi -rpc option to specify a binary. The command would be:
pigi -debug -rpc $PTOLEMY/obj.$PTARCH/pigiRpc/pigiRpc.debug
Copyright © 1990-1997, University of California. All rights
reserved.