From Devpit
  • This article was originally written from a 32-bit PowerPC architecture perspective and register information will vary across architectures. 64-bit Power offset information will be added at a later date.

Compile Your Binary with Debugging Symbols

Runtime debugging with GDB is difficult, though not impossible, if you don't have debugging symbols embedded in your program. (Be aware that adding -g occasionally changes a program's behavior enough that it starts working where it was failing before.)

Compile with the -g flag to get debugging symbols embedded in the application binary:

«user@host»:~/dir§ gcc -g test.c

Now GDB, objdump, nm, and all of the other binary investigation tools can gather extended (readable) symbol information from the binary.

A tutorial on how to locate problem points when you can't use the debugging symbols will be covered later. The basic approach is to compile a second copy of the library with symbols and compare the offsets in that debug version with the offsets in the non-symbol version. For this to work you need an exact copy of the source, compiled with the same compiler and the same options.

Attaching GDB to a running process

Often, when debugging a threading problem, a hang will not reproduce if you invoke the program from within GDB. In that case you'll have to attach gdb to a running program that has already hung. To do so, find the application's pid using ps -afx, then use the following gdb invocation:

«user@host»:~/dir§ gdb
GNU gdb Red Hat Linux (
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "ppc64-redhat-linux-gnu".
(gdb) attach 25912

Manual re-construction of a backtrace

As a series of cascading function calls gets progressively deeper, the stack increases in size. Before branching to another function, the currently running function saves a return address into the "link register", lr. When the called function starts, it gets its own stack space and stores the "link register" into the stack-frame in a place called the "LR Save word". This is the address to which it is supposed to branch when it is ready to return control to its calling function. Each function call does this, which builds a call-stack as each progressively deeper function is called. A backtrace is the series of return addresses for a stack of function calls.

Sometimes a backtrace gets corrupted in GDB because GDB can't figure out how to construct it properly. This will require a manual backtrace reconstruction. We do this by manually rebuilding the stack, frame by frame. Don't worry, it isn't too bad, just time consuming.

You have to have the ABI handy for your particular architecture to know how the stack-frame is constructed.

  • The x86ManualBacktrace tutorial shows how to manually construct a backtrace on x86 based upon the i386 32-bit ABI
  • This tutorial is based upon the ppc 32-bit ELF ABI.

Corrupted gdb Backtrace

Take the following corrupted backtrace as an example:

(gdb) bt
#0  0x10002b2c in __pthread_sigsuspend (set=0x4) at pt-sigsuspend.c:54
#1  0x10001c2c in __pthread_wait_for_restart_signal (self=0x4)
    at pthread.c:1216
#2  0x100008c8 in pthread_join (thread_id=16386, thread_return=0xffffe070)
    at restart.h:34
#3  0x10000428 in finish () at test.c:35
#4  0x100001ec in __do_global_dtors_aux ()
#5  0x10056c78 in _fini ()
#6  0x10007238 in __libc_csu_fini () at elf-init.c:81
#7  0x10007238 in __libc_csu_fini () at elf-init.c:81
#8  0x10007238 in __libc_csu_fini () at elf-init.c:81
#9  0x10007238 in __libc_csu_fini () at elf-init.c:81
Previous frame inner to this frame (corrupt stack?)

Notice how frames #6, #7, #8 and #9 have the same Program Counter (0x10007238)? This indicates that GDB got confused at that point. This may or may not indicate real stack corruption. To find out, we'll have to investigate the stack. Sometimes the corruption can be quite extensive.

  • The following how-to will show how to manually reconstruct the back-trace.
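Spotting the repeated Program Counter by eye is easy in a short trace, but in a long one a small script can help. The following is a minimal sketch, assuming the bt output has been pasted into a string; the function name repeated_frames is hypothetical. It only flags frames whose pc equals that of the frame before them, so legitimate direct recursion at the same call site would be flagged too, and lines without an address (as GDB sometimes prints for frame #0) are skipped.

```python
import re

# Matches lines such as: "#6  0x10007238 in __libc_csu_fini () at elf-init.c:81"
FRAME = re.compile(r"^#(\d+)\s+(0x[0-9a-fA-F]+)\s+in\s+(\S+)")

def repeated_frames(bt_text):
    """Return the numbers of frames whose pc equals the previous frame's pc."""
    suspects = []
    prev_pc = None
    for line in bt_text.splitlines():
        m = FRAME.match(line)
        if not m:
            continue  # continuation lines, messages, frames without an address
        number, pc = int(m.group(1)), int(m.group(2), 16)
        if pc == prev_pc:
            suspects.append(number)
        prev_pc = pc
    return suspects

# Tail of the corrupted backtrace shown above.
bt_text = """\
#5  0x10056c78 in _fini ()
#6  0x10007238 in __libc_csu_fini () at elf-init.c:81
#7  0x10007238 in __libc_csu_fini () at elf-init.c:81
#8  0x10007238 in __libc_csu_fini () at elf-init.c:81
#9  0x10007238 in __libc_csu_fini () at elf-init.c:81
Previous frame inner to this frame (corrupt stack?)
"""
print(repeated_frames(bt_text))  # [7, 8, 9]
```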

Register relevance to backtraces

A good place to start investigation is with the current state of the registers:

(gdb) info reg
r0             0xb2     178
r1             0xffffdeb0       4294958768
r2             0x100c5400       269243392
r3             0x4      4
r4             0x8      8
r5             0xffffdef0       4294958832
r6             0x8      8
r7             0x10080000       268959744
r8             0xffffffc0       4294967232
r9             0x0      0
r10            0x0      0
r11            0x1f     31
r12            0x44000428       1140851752
r13            0x10082814       268970004
r14            0x0      0
r15            0x0      0
r16            0x0      0
r17            0x0      0
r18            0x0      0
r19            0x0      0
r20            0x10080000       268959744
r21            0x1000032c       268436268
r22            0x10007254       268464724
r23            0x100071d4       268464596
r24            0x0      0
r25            0x1      1
r26            0xffffe334       4294959924
r27            0xffffe070       4294959216
r28            0x0      0
r29            0x4002   16386
r30            0x10080000       268959744
r31            0xffffdef0       4294958832
pc             0x10002b2c       268446508
cr             0x34000428       872416296
lr             0x10001c2c       268442668
ctr            0x100071d4       268464596
xer            0x20000000       536870912

On Power hardware there are three really important registers that will help us rebuild the stack and capture our back-trace. These are "general purpose register 1", i.e. r1, the "program counter register", i.e. pc, and the "link register", i.e. lr. The "count register", i.e. ctr, can be useful as well, because it is often used for branching to function pointers.

  1. The "program counter" register pc is denoted by the ppc32 elf abi as the register that holds the pointer to the current instruction (or instruction that a hang or crash waits on).
  2. The ppc32 elf abi denotes the "Link Register", lr, as a register that is volatile across function calls. A function updates the lr with the address to which a yet-to-be-called function should return when it is done. There is no restriction on when a function preparing to branch must fill in the lr. This makes the lr a less reliable indicator: the program could crash after the lr was set but before the branch to the next function, or before the lr was set at all. The abi indicates that a called function saves the lr into the stack frame's LR save location on entry, so that it knows where to branch back to when it returns. A leaf function, which calls nothing else, has no need to do so and can leave the return address in the lr.
  3. Per the ppc32 elf abi, "general purpose register 1", 'r1', always holds the current stack-frame pointer and it is always valid. The contents of the stack-frame pointer (first word) is always the BC (Back Chain) pointer to the previously allocated Stack-Frame. In ppc32 the second word at the stack-frame pointer address (stack-frame pointer + 0x4) is always the LR Save area. It is where the address found in the Link Register when the function is entered is required to be stored.
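The layout described in item 3 can be written down as two fixed offsets from the stack-frame pointer. The following is a minimal sketch, assuming memory is available as a Python dict from word-aligned address to 32-bit value; the dict and the name read_frame are hypothetical, and the sample values are taken from this tutorial's example session.

```python
WORD = 4  # bytes per word on 32-bit PowerPC

def read_frame(memory, sp):
    """Return (back_chain, lr_save_word) for the frame whose pointer is sp.

    memory is a hypothetical dict mapping word-aligned addresses to values.
    """
    back_chain = memory[sp]        # first word: pointer to the previous frame
    lr_save = memory[sp + WORD]    # second word (sp + 0x4): LR Save Word
    return back_chain, lr_save

# r1 in the example session is 0xffffdeb0; these are the two words found there.
memory = {0xffffdeb0: 0xffffdf50, 0xffffdeb4: 0x10001c1c}
back_chain, lr_save = read_frame(memory, 0xffffdeb0)
print(hex(back_chain), hex(lr_save))  # 0xffffdf50 0x10001c1c
```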

We know that the address of the last instruction is held in the pc, 0x10002b2c, so we can keep that in mind when we rebuild our backtrace.

Examining the Stack-Frame in memory

Next, we'll begin to reconstruct the backtrace by locating all of the stack-frame pointers. Let's take a look at the memory comprising the stack, starting at the first stack-frame pointer, which is held in r1. I cheat and exclude parts of the stack that are irrelevant.

(gdb) x /200w 0xffffdeb0
0xffffdeb0:     0xffffdf50      0x10001c1c      0x00000000      0x00000000
0xffffdec0:     0x00000000      0x00000000      0x00000000      0x00000000
0xffffded0:     0x00000000      0x00000000      0x00000000      0x00000000
0xffffdee0:     0x00000000      0x00000000      0x00000000      0x00000000
0xffffdef0:     0xffffdf00      0x00000000      0x100e0000      0x00000000
0xffffdf00:     0xffffdf20      0x00000000      0x00000000      0x00000000
0xffffdf10:     0x00000000      0x00000000      0x000003e0      0x1007d29c
0xffffdf20:     0xffffdf40      0x00000000      0x00000000      0x00000000
0xffffdf30:     0xffffdf50      0x00000000      0x00000000      0xffffdf50
0xffffdf40:     0xffffdf50      0x00004002      0x100c1ba0      0x1007dc64
0xffffdf50:     0xffffe030      0x100008c8      0x100be000      0xffffdf70
0xffffdf60:     0xffffe000      0x10001c1c      0x00000000      0x00000000
0xffffdf70:     0x00000000      0x00000000      0x00000000      0x00000000
0xffffdf80:     0x00000000      0x00000000      0x00000000      0x00000000
0xffffdf90:     0x00000000      0x00000000      0x00000000      0x00000000
0xffffdfa0:     0x00000000      0x00000000      0x00000000      0x00000000
0xffffdfb0:     0x00000003      0x00000004      0x100bec40      0x100bec28
0xffffdfc0:     0xffffdfd0      0x00000001      0x1007db44      0x100be000
0xffffdfd0:     0xffffe000      0x10004dd0      0x00003362      0x44000422
0xffffdfe0:     0x00000000      0x00003362      0x00000000      0x80000000
0xffffdff0:     0x100be000      0x00000000      0x100be000      0x00000094
0xffffe000:     0x1007dc64      0x1000045c      0x00000000      0x100071d4
0xffffe010:     0x100be000      0x00000002      0x00000000      0x00000000
0xffffe020:     0x00000000      0x00000000      0x10080000      0xffffe030
0xffffe030:     0xffffe060      0x10000428      0x00000000      0x1007d29c
0xffffe040:     0xffffe070      0x10012e14      0x00000000      0x00000000
0xffffe050:     0x00000023      0x00000000      0x00000000      0x10080000
0xffffe060:     0xffffe080      0x100001ec      0xffffe304      0x00000000
0xffffe070:     0x00000000      0xffffe418      0x00000000      0xffffffff
0xffffe080:     0xffffe0a0      0x10056c78      0x00000000      0x1007d094
0xffffe090:     0xffffe0a0      0x100066d8      0xffffe304      0x00000000
0xffffe0a0:     0xffffe0c0      0x10007238      0x00000000      0xffffe0b0
0xffffe0b0:     0x00000000      0x00000000      0x00000000      0x10080000
0xffffe0c0:     0xffffe0e0      0x100080c8      0x00000000      0xffffe1b2
0xffffe0d0:     0xffffe0e0      0xffffe418      0x00000000      0x10080000
0xffffe0e0:     0xffffe2f0      0x10006d80      0x00000000      0x00000000
0xffffe0f0:     0x00000000      0x00000000      0x00000000      0x00000000
0xffffe250:     0x00000000      0x00000000      0x00000000      0x00000000
  • In the output above, 0xffffdeb0, taken from r1, is the current (as of the hang or crash) stack-frame pointer, which coincides with the instruction in the pc register. The word stored at the stack-frame pointer is the back-chain pointer, i.e. the address of the previous stack frame. So 0xffffdf50 is the address of the previous stack frame.
  • As mentioned earlier, the second word of a stack frame is the LR Save Word. So address 0xffffdeb4 holds the LR Save Word for the current frame, 0x10001c1c, which is an instruction address to which a function returns with blr (branch to link register). We use these addresses to determine which function each call was made from.
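Reading the words off the dump by hand is error-prone, so it can help to load the dump into a lookup table first. The following is a rough sketch, assuming the output of x /200w has been saved as text; the name parse_words is hypothetical, and the optional <symbol+offset> label that GDB sometimes prints after the address is tolerated but ignored.

```python
import re

# One dump line: an address, an optional "<symbol+offset>" label, a colon, words.
LINE = re.compile(r"^(0x[0-9a-fA-F]+)(?:\s+<[^>]*>)?:\s+(.*)$")

def parse_words(dump_text, word_size=4):
    """Turn GDB 'x /Nw' output into a dict of address -> word value."""
    memory = {}
    for line in dump_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # prompts, blank lines, anything that is not a dump row
        base = int(m.group(1), 16)
        for i, token in enumerate(m.group(2).split()):
            memory[base + i * word_size] = int(token, 16)
    return memory

# Two rows copied from the dump above.
sample = """\
0xffffdeb0:     0xffffdf50      0x10001c1c      0x00000000      0x00000000
0xffffdf50:     0xffffe030      0x100008c8      0x100be000      0xffffdf70
"""
memory = parse_words(sample)
print(hex(memory[0xffffdeb4]))  # 0x10001c1c
```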

Gathering the saved Program Counters

We can compose the following stack-frame table by following the backchain pointer from frame to frame and recording the LR Save Word for each stack-frame:

  • NOTE: When the backchain pointer for a stack-frame is 0x00000000 we know we've reached the start of the program.
 stack frame ptr   backchain ptr     LR save word
    0xffffdeb0:     0xffffdf50      0x10001c1c
    0xffffdf50:     0xffffe030      0x100008c8
    0xffffe030:     0xffffe060      0x10000428
    0xffffe060:     0xffffe080      0x100001ec
    0xffffe080:     0xffffe0a0      0x10056c78
    0xffffe0a0:     0xffffe0c0      0x10007238
    0xffffe0c0:     0xffffe0e0      0x100080c8
    0xffffe0e0:     0xffffe2f0      0x10006d80
    0xffffe250:     0x00000000      0x00000000
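The table can also be produced mechanically by following the backchain pointers. The following is a minimal sketch, assuming the relevant words from the dump are held in a dict (the dict and the name walk_back_chain are hypothetical). It stops when the backchain is zero, when it points at or below the current frame (the stack grows toward lower addresses, so that would indicate corruption), or when it leaves the captured part of the stack.

```python
def walk_back_chain(memory, sp, max_frames=64):
    """Follow back-chain pointers from sp; return (sp, back_chain, lr_save) rows."""
    frames = []
    while len(frames) < max_frames:
        if sp not in memory or sp + 4 not in memory:
            break  # ran off the part of the stack that was captured
        back_chain = memory[sp]      # first word of the frame
        lr_save = memory[sp + 4]     # second word: LR Save Word
        frames.append((sp, back_chain, lr_save))
        if back_chain <= sp:
            break  # zero ends the chain; a lower address suggests corruption
        sp = back_chain
    return frames

# Back chain and LR Save Word of each frame, copied from the dump above.
memory = {
    0xffffdeb0: 0xffffdf50, 0xffffdeb4: 0x10001c1c,
    0xffffdf50: 0xffffe030, 0xffffdf54: 0x100008c8,
    0xffffe030: 0xffffe060, 0xffffe034: 0x10000428,
    0xffffe060: 0xffffe080, 0xffffe064: 0x100001ec,
    0xffffe080: 0xffffe0a0, 0xffffe084: 0x10056c78,
    0xffffe0a0: 0xffffe0c0, 0xffffe0a4: 0x10007238,
    0xffffe0c0: 0xffffe0e0, 0xffffe0c4: 0x100080c8,
    0xffffe0e0: 0xffffe2f0, 0xffffe0e4: 0x10006d80,
}
frames = walk_back_chain(memory, 0xffffdeb0)
for sp, back_chain, lr_save in frames:
    print(f"0x{sp:08x}:  0x{back_chain:08x}  0x{lr_save:08x}")
```

With only the excerpted rows loaded, the walk ends after the frame at 0xffffe0e0, because the address its backchain points to is not part of the excerpt.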

Use objdump to attach symbols to addresses

The next thing to do is to use another terminal to objdump the disassembly and symbol information from the binary so that we can see which functions the instruction pointers stored in the LR Save Words reside in.

«user@host»:~/dir§ objdump -tD a.out > a.dis

For the backtrace we're really only interested in the program counters (the values in LR save word) so we construct a backtrace table using the value in the pc as the first address:

#0  0x10002b2c
#1  0x10001c1c
#2  0x100008c8
#3  0x10000428
#4  0x100001ec
#5  0x10056c78
#6  0x10007238
#7  0x100080c8
#8  0x10006d80

Investigate objdump disassembly for program counters

Now, start looking up the addresses in the disassembly file. Remember, if you don't see symbol names you either didn't build with the -g option or you didn't ask objdump for the symbol information.

Searching the disassembly for address 0x10002b2c gives the following, which tells us that 0x10002b2c resides in the function __pthread_sigsuspend:

10002b20 <__pthread_sigsuspend>:
10002b20:       38 00 00 b2     li      r0,178
10002b24:       38 80 00 08     li      r4,8
10002b28:       44 00 00 02     sc
10002b2c:       7c 00 00 26     mfcr    r0
10002b30:       4e 80 00 20     blr

We'll do one more example, since the first instruction address is usually a bit different from the rest: it is usually the instruction that caused the crash, not a function return address like the remaining addresses will be. Look at the next address in the list, 0x10001c1c.

10001be4 <__pthread_wait_for_restart_signal>:
10001be4:       94 21 ff 60     stwu    r1,-160(r1)
10001be8:       7c 08 02 a6     mflr    r0
10001bec:       93 e1 00 9c     stw     r31,156(r1)
10001bf0:       3b e1 00 10     addi    r31,r1,16
10001bf4:       93 c1 00 98     stw     r30,152(r1)
10001bf8:       38 80 00 00     li      r4,0
10001bfc:       7f e5 fb 78     mr      r5,r31
10001c00:       38 60 00 02     li      r3,2
10001c04:       3f c0 10 08     lis     r30,4104
10001c08:       90 01 00 a4     stw     r0,164(r1)
10001c0c:       48 00 60 e5     bl      10007cf0 <__sigprocmask>
10001c10:       80 9e a8 24     lwz     r4,-22492(r30)
10001c14:       7f e3 fb 78     mr      r3,r31
10001c18:       48 00 63 0d     bl      10007f24 <sigdelset>
10001c1c:       38 00 00 00     li      r0,0
10001c20:       90 02 8c 24     stw     r0,-29660(r2)
10001c24:       7f e3 fb 78     mr      r3,r31
10001c28:       48 00 0e f9     bl      10002b20 <__pthread_sigsuspend>
10001c2c:       81 3e a8 24     lwz     r9,-22492(r30)
10001c30:       80 02 8c 24     lwz     r0,-29660(r2)
10001c34:       7f 80 48 00     cmpw    cr7,r0,r9
10001c38:       40 9e ff ec     bne+    cr7,10001c24 <__pthread_wait_for_restart_signal+0x40>
10001c3c:       7c 00 04 ac     sync
10001c40:       80 01 00 a4     lwz     r0,164(r1)
10001c44:       83 c1 00 98     lwz     r30,152(r1)
10001c48:       83 e1 00 9c     lwz     r31,156(r1)
10001c4c:       7c 08 03 a6     mtlr    r0
10001c50:       38 21 00 a0     addi    r1,r1,160
10001c54:       4e 80 00 20     blr

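This one is interesting because we know we are in __pthread_wait_for_restart_signal, yet the saved LR (0x10001c1c) is the instruction after the bl to sigdelset, which comes before the call to __pthread_sigsuspend that we just made. The most likely explanation is that __pthread_sigsuspend is a leaf function: its disassembly above contains no stwu and no store of the lr, so it never allocates a stack frame or writes an LR Save Word. The word at 0xffffdeb4 is therefore probably stale, left over from the earlier call to sigdelset, and the live return address is still in the lr register, 0x10001c2c, which is the instruction after the bl to __pthread_sigsuspend and matches frame #1 of GDB's own backtrace. Either address lies inside __pthread_wait_for_restart_signal, so the function recorded for frame #1 is the same. Continue to trace each instruction address in our backtrace and rebuild it until you get the following: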

Apply symbols to rebuilt backtrace

#0  0x10002b2c  in __pthread_sigsuspend
#1  0x10001c1c  in __pthread_wait_for_restart_signal
#2  0x100008c8  in pthread_join
#3  0x10000428  in finish
#4  0x100001ec  in __do_global_dtors_aux
#5  0x10056c78  in _fini
#6  0x10007238  in __libc_csu_fini
#7  0x100080c8  in exit
#8  0x10006d80  in __libc_start_main
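The lookups behind this table can be automated once the disassembly is in a file. The following is a rough sketch, assuming objdump's usual "address <name>:" function header lines; the names load_functions and symbolize are hypothetical. It picks the nearest function that starts at or before each address and has no knowledge of function sizes, so an address that falls in a gap after a function would still be attributed to it.

```python
import bisect
import re

# Function header lines in objdump output look like: "10002b20 <name>:"
HEADER = re.compile(r"^([0-9a-fA-F]+) <([^>]+)>:$")

def load_functions(disassembly_text):
    """Return a sorted list of (start_address, name) from objdump output."""
    funcs = []
    for line in disassembly_text.splitlines():
        m = HEADER.match(line.strip())
        if m:
            funcs.append((int(m.group(1), 16), m.group(2)))
    funcs.sort()
    return funcs

def symbolize(funcs, address):
    """Name the nearest function starting at or before address, with offset."""
    starts = [start for start, _ in funcs]
    i = bisect.bisect_right(starts, address) - 1
    if i < 0:
        return None  # address is below every known function
    start, name = funcs[i]
    return f"{name}+0x{address - start:x}"

# A few lines from the listings shown earlier in this tutorial.
listing = """\
10001be4 <__pthread_wait_for_restart_signal>:
10001be4:       94 21 ff 60     stwu    r1,-160(r1)
10002b20 <__pthread_sigsuspend>:
10002b20:       38 00 00 b2     li      r0,178
"""
funcs = load_functions(listing)
for pc in (0x10002b2c, 0x10001c1c):
    print(f"0x{pc:08x} in {symbolize(funcs, pc)}")
```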

Congratulations, you've successfully reconstructed a backtrace.

The End


  • The original content of this tutorial was provided by Ryan S. Arnold, aka RandomTask, from his engineering journal.