Debugging on OS/2
Previously I wrote some notes about
basic debugging issues.
Here I want to briefly mention some OS/2 and EMX/gcc specifics.
Peculiarities of debugging PM apps are not covered here.
Tools and Helpers
For good reasons I only refer to
"no cost" products
here ...
-
There are some debuggers available for free:
-
Resolving segmentation faults (
SIGSEGV):
Look out for debugging implementations of malloc() such as
dmalloc.
-
dbmalloc
-
is available from
OS/2 software sites.
It's rather old now (>= 7 years), but still builds and the
supplied test examples work.
If you're going for a quick check it is sufficient just to
link against this library:
gcc -Zmt -Zexe -o foo foo.o -lbar -ldbmalloc
Often, though, you will want more helpful output
than you get this way. In that case add
#include <dbmalloc.h>
to all your source files (strictly you don't need it
in all of them, but it is a good idea ...) and rebuild.
-
dmalloc
-
I'm offering a
build for OS/2 EMX. Usage is quite
similar to
dbmalloc.
-
libefence
-
a well-known one on un*x, isn't available on OS/2.
-
ccmalloc
-
nice one for i86 linux, isn't available on OS/2.
-
Operating System/2 API Trace
(os2trace.zip on OS/2 software sites):
Enables, customizes, controls and summarizes the tracing of
OS/2 APIs imported by a 16-bit or 32-bit executable file
without affecting its source code or requiring recompiling or relinking.
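For the debugging malloc libraries listed above, a source file can be prepared so that it builds with or without the extra header. The following is only a minimal sketch and is not taken from the dbmalloc distribution: the USE_DBMALLOC macro and the copy_string() helper are hypothetical names chosen for illustration; only the header name dbmalloc.h comes from the text above.

```c
#include <stdlib.h>
#include <string.h>

/* USE_DBMALLOC is a hypothetical switch for this sketch; define it
 * (e.g. with -DUSE_DBMALLOC) only when dbmalloc is installed. */
#ifdef USE_DBMALLOC
#include <dbmalloc.h>
#endif

/* Return a heap copy of s, or NULL if out of memory.
 * The caller is responsible for calling free() on the result. */
char *copy_string(const char *s)
{
    size_t len = strlen(s);

    /* len + 1 leaves room for the terminating '\0'. Writing malloc(len)
     * here is the classic off-by-one overrun that debugging malloc
     * libraries are meant to help you find. */
    char *p = malloc(len + 1);

    if (p != NULL)
        memcpy(p, s, len + 1);
    return p;
}
```

Compiled with -DUSE_DBMALLOC and linked with -ldbmalloc as in the gcc line above, this file goes through the debugging library; without the macro it builds against the normal libc allocator.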
Debugging Basics
Though called "basic", the following methods may not be obvious
even if you already have some experience with debugging in general.
-
To set a
hard-coded breakpoint
(see
SIGTRAP)
within your application you may use this macro:
#define BREAKPOINT __asm__("int3")  /* use as: BREAKPOINT; */
-
Core dumps are images of the current process
(mainly its memory) written to disk. On EMX they're only
available when using a.out objects; they can
afterwards be debugged using
GDB. Core dumps
can also be produced on request, see _core().
-
Don't mix OS/2's and EMX's memory handling!
-
Breakpoint
on
exit():
If an executable is linked against emxlibcm.dll (or
emxlibcs.dll; I will use the term "emxlibc*.dll"
in the following)
exit() and other libc functions (also those
related to exit() like abort() and
_exit()) are not known symbols to gdb.
Unfortunately this usually happens when building X11 apps (see
XFree86 OS/2 FAQ).
exit() is located in emxlibc*.dll,
which was linked from omf objects using LINK386.
gdb is unable to resolve symbols by name
from DLLs of this kind. Therefore it doesn't know the address of
exit() and cannot set a breakpoint on it. Look up its offset
in \emx\etc\emxlibc*.map, e.g.
0001:0000DB78 exit exit
Use set show-dlls to make gdb
stop upon accessing the DLL for the first time and set
a specific breakpoint on that specific DLL by using
dll-break emxlibcm. From that output, e.g.
[Load DLL: E:\PROGRAM\EMX\DLL\EMXLIBCM.DLL]
[.text: 0x1db80000 - 0x1dba9a80]
[.data: 0x187b0000 - 0x187b6060]
[.bss: 0x187b6060 - 0x187b97c0]
extract the .text (i.e. executable code) base address.
Then set the breakpoint on the address calculated as the sum
of base and offset:
b *(0x1db80000+0xdb78)
The EMX docs claim that "due to a bug in OS/2 the breakpoint will apply
to all programs using emxlibc*.dll".
I couldn't verify this with recent versions of OS/2 ...
-
Resolving X Errors
See
debugging.html for some
general comments on this issue.
Setting breakpoints in the X11 libs instead of
libc (emxlibc*.dll) requires linking a set of debuggable
X11 libraries statically, so it
usually doesn't help much with XFree86 OS/2/EMX.
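The address arithmetic from the exit() breakpoint item above (DLL .text base plus the offset from the map file) can be sketched in C. This is a minimal illustration, not part of EMX or gdb: map_entry_address() is a hypothetical helper, and it assumes the map line has the "segment:offset name" shape of the sample shown above and that segment 0001 is the DLL's .text segment.

```c
#include <stdio.h>

/* Parse one symbol line from a map file, e.g.
 *     "0001:0000DB78 exit exit"
 * and return text_base + offset.
 * Returns 0 if the line does not start with "hex:hex". */
unsigned long map_entry_address(const char *map_line, unsigned long text_base)
{
    unsigned int segment;   /* parsed but not used in this sketch */
    unsigned long offset;

    if (sscanf(map_line, "%x:%lx", &segment, &offset) != 2)
        return 0;
    return text_base + offset;
}
```

With the values from the example, map_entry_address("0001:0000DB78 exit exit", 0x1db80000UL) yields 0x1db8db78, the same address as in b *(0x1db80000+0xdb78).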
Misc Hints
This is intended to be a collection of ideas for when you run out
of them while trying to resolve a problem.
I could really use more input from other developers here!
-
Subtle segmentation faults are sometimes triggered by running
out of stack space.
Endless recursion is a good candidate for this.
Or the stack size was set too small
(see -Zstack option of gcc). Since this cannot be fixed
on OS/2 at runtime (but only when linking the application), just make
it big enough for any possible situation. On the other hand, setting it too big
may lead to more subtle errors, like sys1059 when
trying to start an application ...
-
Another candidate is the use of stale pointers: references to
memory/variables of storage class
auto that no longer exist. This is hard to debug,
since few tools will tell you when you try to free
such a pointer.
-
If problems occur while reading input (binary data like images,
or text files as well) check whether the file (socket, pipe) is being read
in the correct mode (see
fopen(), -Zbin-files).
You may have to write explicit code to read in text files which may be
in either
DOS or un*x format.
-
Make sure a program which accesses DLLs (via import libs or dynamically)
loads the correct versions of those libraries. You may use
ldd foo.exe to see the shared libs of that executable
(check the porting FAQ for that command).
-
Applications which depend on signals (including usage as a timer, for
data acquisition, animations) and which don't work properly might suffer
from using the wrong signal model (see
Porting FAQ).
-
Sometimes one forgets that
fork() doesn't work in
an executable based on omf-objects/linked with
link386.
-
If you believe you have discovered a bug in your current EMX/gcc,
try the
alternative versions.
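The hint above about reading text files that may be in either DOS or un*x format can be sketched as follows. This is one possible approach, not the only one: the stream is assumed to be opened in binary mode ("rb"), so that no CR/LF translation happens in libc, and the line ending is stripped by hand. The names chomp_eol() and read_text_line() are hypothetical helpers made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Remove a trailing "\n", "\r\n" or lone "\r" from line.
 * Returns the new length of the string. */
size_t chomp_eol(char *line)
{
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';
    if (len > 0 && line[len - 1] == '\r')
        line[--len] = '\0';
    return len;
}

/* Read one line from fp (opened with "rb") into buf and strip its
 * line ending. Returns 1 on success, 0 on end of file or error.
 * Lines longer than size - 1 characters arrive in several pieces. */
int read_text_line(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;
    chomp_eol(buf);
    return 1;
}
```

A caller would open the file with fopen(name, "rb") and call read_text_line() in a loop; the same code then handles both line-ending conventions.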
Other Resources
Here I collect references to other information resources which
have not been mentioned so far.
-
IBM OS/2 Debugging handbook (INF format)
-
More info about traps, the native OS/2 error mechanism,
can be found in
-
except3.zip
Contains sample code using exception handling for debugging purposes.