Re: [2.6.9] NMI watchdog detected lockup.

From: Aristeu Rozanski
Date: Thu Mar 19 2009 - 09:38:19 EST


Hi Pawel,
> we're currently testing trial version of Jungo pci driver for linux/windows
> (http://www.jungo.com/st/windriver_usb_pci_driver_development_software.html)
> and get 'NMI Watchdog detected LOCKUP' on athlon64/opteron smp systems
> with rhel 2.6.9 kernel. from the other side, the lockup doesn't occur
> on intel x86_64 smp systems. bad news is that Jungo developers can't
> reproduce the lockup while we can trig it during simple pci bus
> scanning/opening device.
>
> only diagnostic we have is console log grabbed over rs232 link.
first: 2.6.9 is an old kernel, so I suppose you're using a vendor
kernel.
second: I can't see any obvious link to download the source code for
that driver. we can't help you with binary-only modules, so please contact
both the kernel vendor and the driver vendor.

> NMI Watchdog detected LOCKUP, CPU=0, registers:
something (probably the Jungo driver) is keeping the CPU busy for too long,
so the NMI watchdog is not being reset often enough. try booting with the
nmi_watchdog=0 kernel parameter and check if the driver works. if the machine
hangs, it means that the NMI watchdog was right and the CPU really was stuck
because of a bug. contact your vendors and report the results to them.
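
after rebooting, it may help to confirm that the parameter actually reached
the kernel before drawing conclusions from the test. a minimal sketch in
Python, assuming a Linux system where /proc/cmdline holds the boot command
line; the function names here are made up for illustration, and treating the
last nmi_watchdog= occurrence as the effective one is an assumption:

```python
from typing import Optional


def nmi_watchdog_setting(cmdline: str) -> Optional[str]:
    """Return the value given to nmi_watchdog= on a kernel command line.

    Returns None when the parameter is absent. If it appears more than
    once, the last value is returned (assumed to be the effective one).
    """
    value = None
    for token in cmdline.split():
        if token.startswith("nmi_watchdog="):
            value = token.split("=", 1)[1]
    return value


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()
```

calling nmi_watchdog_setting(read_cmdline()) on the test machine should give
"0" if the watchdog was disabled at boot, or None if the parameter never made
it into the boot loader configuration.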

--
Aristeu
