Re: Fw: Re: 2.6.21-rc5-mm3

From: Andi Kleen
Date: Fri Mar 30 2007 - 15:31:54 EST



>
> BUG: NMI Watchdog detected LOCKUP on CPU0, eip c014ce9c, registers:

I suspect his console is just too slow: printing the unwound backtrace
took too long, and the NMI watchdog happened to fire inside the unwinder.

You did use a slow console, right?

I suppose it just needs a strategically placed touch_nmi_watchdog(). Will add that.
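
To illustrate the idea, here is a rough userspace model, not kernel code. Only
touch_nmi_watchdog() is a real kernel function; every name below, the tick
limit, and the per-frame cost are assumptions made up for this sketch. It shows
a slow per-frame print loop tripping a watchdog unless the loop resets the
watchdog after each frame.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of this is the kernel's implementation. */
#define WATCHDOG_LIMIT 5            /* hypothetical: ticks allowed without a touch */

static unsigned int stalled_ticks;  /* ticks since the watchdog was last touched */
static bool lockup_reported;        /* set once the model would print "LOCKUP" */

/* Stand-in for the kernel's touch_nmi_watchdog(): resets the stall counter. */
static void touch_nmi_watchdog_model(void)
{
    stalled_ticks = 0;
}

/* Stand-in for the periodic NMI: counts ticks that passed without a touch. */
static void nmi_tick_model(void)
{
    if (++stalled_ticks > WATCHDOG_LIMIT)
        lockup_reported = true;
}

/*
 * Hypothetical slow backtrace printer. Each frame costs ticks_per_frame
 * watchdog ticks (the slow console). If touch is true, the watchdog is
 * reset after every frame. Returns true if a lockup would be reported.
 */
static bool dump_frames_model(int frames, int ticks_per_frame, bool touch)
{
    assert(frames >= 0 && ticks_per_frame >= 0);

    stalled_ticks = 0;
    lockup_reported = false;

    for (int i = 0; i < frames; i++) {
        for (int t = 0; t < ticks_per_frame; t++)
            nmi_tick_model();
        if (touch)
            touch_nmi_watchdog_model();
    }
    return lockup_reported;
}
```

In this model, dump_frames_model(10, 2, false) reports a lockup, while the same
call with touch set to true does not, because the counter never gets past one
frame's worth of ticks.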

> Modules linked in: ide_cd cdrom rtc unix
> CPU: 0
> EIP: 0060:[<c014ce9c>] Not tainted VLI
> EFLAGS: 00000093 (2.6.21-rc5-mm3 #10)
> EIP is at read_pointer+0x49/0x2d8
> eax: c7a8dd04 ebx: 00000000 ecx: 00000000 edx: c043d19c
> esi: c043d184 edi: c043d19c ebp: c7a8dbf4 esp: c7a8dbbc
> ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
> Process udevd (pid: 864, ti=c7a8c000 task=c97cd4d0 task.ti=c7a8c000)
> Stack: 0000000b c7a8dc70 c7a8dbf4 c014d6d0 c7a8dd04 c043e454 c7a8dbf4 c011fdd1
> c043e404 c043e454 c043d190 0005cd94 c043d184 c7a8dd04 c7a8dd14 c014dd3a
> 00000000 00000000 00000000 00000000 c7a8df44 c7a8dd74 c741eefb 00000008
> Call Trace:
> [<c014dd3a>] unwind+0x414/0xfa2
> [<c010510d>] dump_trace_unwind+0xb4/0xe5
> [<c014ce4d>] unwind_init_running+0x25/0x2b
> [<c01051a1>] dump_trace+0x63/0x1eb
> [<c010ad39>] save_stack_trace+0x23/0x42
> [<c01393c9>] update_cpu_base_expires_next+0x56/0x5a
> [<c013a475>] hrtimer_interrupt+0x17c/0x1b8
> [<c0115e8e>] smp_apic_timer_interrupt+0x72/0x85
> [<c0104bef>] apic_timer_interrupt+0x33/0x38
> [<c014320c>] lock_release+0x1d2/0x1da
> [<c013a7ef>] up_read+0x19/0x2e
> [<c011b800>] do_page_fault+0x28f/0x55b
> [<c034d191>] error_code+0x79/0x80
> [<c020c23e>] __put_user_4+0x12/0x18
> DWARF2 unwinder stuck at __put_user_4+0x12/0x18
> Leftover inexact backtrace:
> [<c01040d6>] ret_from_fork+0x6/0x1c

Hmpf. I saw it get stuck once in child_rip here too. When I tried to reproduce
it so I could report it properly, I couldn't. A few other backtraces through
child_rip on essentially the same kernel all unwound without getting stuck.
Something weird is going on.

> [<c013a475>] hrtimer_interrupt+0x17c/0x1b8
> [<c0115e8e>] smp_apic_timer_interrupt+0x72/0x85
> [<c0104bef>] apic_timer_interrupt+0x33/0x38
> [<c020c184>] __get_user_4+0x14/0x17
> DWARF2 unwinder stuck at __get_user_4+0x14/0x17
> Leftover inexact backtrace:

Now that is weird. I have never seen that before. Jan, any ideas?

Which compiler (gcc version) are you using, Michal?

-Andi


Were there any strange binutils in use, Michal?

> [<c018b42f>] do_execve+0xdd/0x210
> [<c0102497>] sys_execve+0x3f/0x62
> [<c01041c2>] sysenter_past_esp+0x5f/0x99