debugging a deadlock

From: Janne Karhunen
Date: Fri Sep 08 2006 - 03:33:48 EST


Hi,

Sometimes you have to do strange things such as trying to debug occasional
deadlock of a system that has been in use for long, long time. So please, no
nasty comments about outdated system with no soft lock-up detection and
such :/

Anyhoo, it appears to be infinite semaphore wait. By modifying the semaphores
to dump stack on LONG wait I managed to get a stack trace. Looks like this:

kernel: [printk+340/384] [printk+340/384] [show_trace+203/240]
[show_trace+203/240] [show_stack+113/120] [show_registers+223/324]
kernel: [__down+147/276] [__down_failed+8/12]
[stext_lock+13538/52069] [error_code+16/64] [system_call+66/76]

Umm, this error_code thing is beyond my current knowledge. What's this?
Some sort of assembly-glued exception handling? Any ideas how to figure
out which semaphore this is?


--
// Janne
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/