Advice on debuging the fork syscall hanging forever

From: Miguel Freitas
Date: Wed Apr 25 2007 - 10:21:47 EST


hi,

making a long story short: i'm trying to debug a problem with the fork
syscall as called from glibc's popen() function. it seems that a given
process (in case Xorg) get stuck inside a syscall that never returns.
while stuck it keeps using 100% of cpu.

I can confirm the problem with kernel 2.6.18.8 / x86_64 / SMP (my
system) but i can't yet tell who is to blame here (Xorg, glibc or the
kernel itself?). it has also been reported with kernel 2.6.17 and
IA32.

the full bug report for Xorg is available at:

https://bugs.freedesktop.org/show_bug.cgi?id=10525

below i will quote only the relevant gdb session (comment #16) so
kernel hackers should be able to immediately spot something obvious
that i must be missing...

# gdb Xorg <pid>
(...)
Program received signal SIGINT, Interrupt.
0x00002b2e77d6d5bd in fork () from /lib64/libc.so.6

(gdb) disass
Dump of assembler code for function fork:
0x00002b2e77d6d530 <fork+0>: push %rbp
0x00002b2e77d6d531 <fork+1>: mov %rsp,%rbp
0x00002b2e77d6d534 <fork+4>: push %r14
0x00002b2e77d6d536 <fork+6>: push %r13
0x00002b2e77d6d538 <fork+8>: push %r12
0x00002b2e77d6d53a <fork+10>: push %rbx
0x00002b2e77d6d53b <fork+11>: sub $0x30,%rsp
0x00002b2e77d6d53f <fork+15>: mov 2807650(%rip),%rcx #
0x2b2e7801aca8 <__fork_handlers>
0x00002b2e77d6d546 <fork+22>: test %rcx,%rcx
0x00002b2e77d6d549 <fork+25>: mov %rcx,%rbx
0x00002b2e77d6d54c <fork+28>: je 0x2b2e77d6d576 <fork+70>
0x00002b2e77d6d54e <fork+30>: mov 0x28(%rcx),%edx
0x00002b2e77d6d551 <fork+33>: test %edx,%edx
0x00002b2e77d6d553 <fork+35>: je 0x2b2e77d6d546 <fork+22>
0x00002b2e77d6d555 <fork+37>: lea 0x1(%rdx),%esi
0x00002b2e77d6d558 <fork+40>: mov %edx,%eax
0x00002b2e77d6d55a <fork+42>: lock cmpxchg %esi,0x28(%rcx)
0x00002b2e77d6d55f <fork+47>: cmp %eax,%edx
0x00002b2e77d6d561 <fork+49>: je 0x2b2e77d6d7c2 <fork+658>
0x00002b2e77d6d567 <fork+55>: mov 2807610(%rip),%rcx #
0x2b2e7801aca8 <__fork_handlers>
0x00002b2e77d6d56e <fork+62>: test %rcx,%rcx
0x00002b2e77d6d571 <fork+65>: mov %rcx,%rbx
0x00002b2e77d6d574 <fork+68>: jne 0x2b2e77d6d54e <fork+30>
0x00002b2e77d6d576 <fork+70>: xor %r12d,%r12d
0x00002b2e77d6d579 <fork+73>: callq 0x2b2e77d44e30 <__GI__IO_list_lock>
0x00002b2e77d6d57e <fork+78>: mov %fs:0x90,%r9d
0x00002b2e77d6d587 <fork+87>: mov %fs:0x94,%r8d
0x00002b2e77d6d590 <fork+96>: mov %r8d,%eax
0x00002b2e77d6d593 <fork+99>: neg %eax
0x00002b2e77d6d595 <fork+101>: mov %eax,%fs:0x94
0x00002b2e77d6d59d <fork+109>: mov %fs:0x10,%r10
0x00002b2e77d6d5a6 <fork+118>: xor %edx,%edx
0x00002b2e77d6d5a8 <fork+120>: add $0x90,%r10
0x00002b2e77d6d5af <fork+127>: xor %esi,%esi
0x00002b2e77d6d5b1 <fork+129>: mov $0x1200011,%edi
0x00002b2e77d6d5b6 <fork+134>: mov $0x38,%eax
0x00002b2e77d6d5bb <fork+139>: syscall
0x00002b2e77d6d5bd <fork+141>: cmp $0xfffffffffffff000,%rax
0x00002b2e77d6d5c3 <fork+147>: ja 0x2b2e77d6d720 <fork+496>
0x00002b2e77d6d5c9 <fork+153>: test %eax,%eax
0x00002b2e77d6d5cb <fork+155>: mov %eax,%r14d

note the PC is pointing to <fork+141>, so i'm assuming it must be hanging
inside the kernel (syscall).

i believe the relevant code from glibc is the following:

pid_t
__libc_fork (void)
{
(... stripped ...)
_IO_list_lock (); <- note __GI__IO_list_lock in disass above!

#ifndef NDEBUG
pid_t ppid = THREAD_GETMEM (THREAD_SELF, tid);
#endif

/* We need to prevent the getpid() code to update the PID field so
that, if a signal arrives in the child very early and the signal
handler uses getpid(), the value returned is correct. */
pid_t parentpid = THREAD_GETMEM (THREAD_SELF, pid);
THREAD_SETMEM (THREAD_SELF, pid, -parentpid);

#ifdef ARCH_FORK
pid = ARCH_FORK ();
#else
# error "ARCH_FORK must be defined so that the CLONE_SETTID flag is used"
pid = INLINE_SYSCALL (fork, 0);
#endif

---

i think the syscall in disass above must be either ARCH_FORK or INLINE_SYSCALL.

using x86_64's unistd.h and converting eax 0x38 => 56 => __NR_clone. this is
funny because __NR_fork is 57 which is what i would expect. i386's unistd.h
yields even stranger syscall: __NR_mpx.

still, this is not the value currently loaded on rax:
(gdb) info reg
rax 0xfffffffffffffdff -513
rbx 0x0 0
rcx 0xffffffffffffffff -1
rdx 0x0 0
rsi 0x0 0
rdi 0x1200011 18874385
(...)

now the great trick:

(gdb) set $rax = 0x38
(gdb) quit
The program is running. Quit anyway (and detach it)? (y or n) y

-> done. xorg is good again.

do anybody have any idea on what is going on here?

please CC me if you have any comments...

thanks,

Miguel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/