Re: 2.4.31 hangs, no information on console or serial port

From: Willy Tarreau
Date: Tue Feb 21 2006 - 16:44:05 EST


On Tue, Feb 21, 2006 at 11:04:57AM -0500, David Golombek wrote:
> Benjamin LaHaise <bcrl@xxxxxxxxx> writes:
> > On Tue, Feb 21, 2006 at 10:23:56AM -0500, David Golombek wrote:
> > > Any suggestions as to how we might debug this or possible causes would
> > > be greatly appreciated.
> >
> > Have you tried turning on the NMI watchdog (nmi_watchdog=1)? It
> > should be able to kick the machine out of the locked state, as these
> > symptoms would hint at a spinlock deadlock with interrupts disabled.
> > Also, try to reproduce on the latest 2.4.33pre. That said, for an
> > I/O-intensive workload like you're running, 2.6 is much better,
> > especially for systems using highmem.
>
> I'll enable nmi_watchdog as soon as we can bring the machine down,
> thanks for the excellent suggestion. I'd entirely forgotten about the
> watchdog. I'll try switching to 2.4.33pre as soon as possible; it
> certainly has several fixes we've been waiting for. 2.6 is still a
> ways off, lots of qualification work to do.

BTW, if your console blanks, you should disable blanking so that the
last messages stay visible after a hang:

# setterm -blank 0

Maybe you'll notice some "OOM: killing process" messages indicating
that some hungry process is going mad (possibly the NFS server).
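
If the console has already scrolled, the same messages may still be in
the kernel log. A rough sketch, assuming syslog writes kernel messages
to /var/log/messages (that path, the find_oom helper name, and the
search pattern are assumptions; the exact message wording differs
between kernel versions, so the pattern is deliberately loose):

```shell
# Hypothetical helper: read log text on stdin and print only the lines
# that look like OOM-killer messages. The pattern is a loose guess at
# the wording, not the exact kernel message.
find_oom() {
    grep -i -w -E 'oom|out of memory'
}

# Usage (left commented out; the log path is an assumption):
# dmesg | find_oom
# find_oom < /var/log/messages
```

No output from either command only means that nothing matching was
logged, not that memory pressure is ruled out.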

> Thanks,
> Dave

Regards,
Willy
