Re: [PATCH] Documentation/oops-tracking update

Andi Kleen (ak@muc.de)
Tue, 28 Dec 1999 13:37:11 +0100


On Mon, Dec 27, 1999 at 11:33:10PM +0100, Pete Wyckoff wrote:
> manfreds@colorfullife.com said:
> > From: Andi Kleen <ak@muc.de>
> > > I think it is bad advice, because a lot of problems only happen in SMP
> > > mode or under high load that may need two cpus. The best advice is IMHO
> > > to put a serial console onto it and log oopses from another computer.
> > > In case of deadlocks the NMI oopser in 2.4 should solve that problem,
> > > for 2.2 it may be useful to give a pointer to it.
> > >
> > Yesterday I tracked a bug in the SCSI layer: oops, the current CPU owned the
> > io_request_lock:
> >
> > * oops.
> > * a few seconds later: lock-up on the io_request_lock.
> > * NMI oops detection: several additional oops'es, the screen scrolls down,
> > ie you cannot copy it with paper&pencil.
> > * was not logged to the disk because the io_request_lock was missing.
> > * no serial console, no line printer was available.
>
> As useful as the nmi watchdog can be, it's also handy to disable it at
> times for cases like the above. Boot with "nmi_watchdog=0" to keep the
> watchdog from scrolling away the _real_ oops.

That is IMHO a bug in the NMI oopser. A panic/oops should set a flag that
prevents the subsequent watchdog oops. This would also "fix" the watchdog oops
that shows up when the kernel for some reason cannot find the root fs and panics.
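
Roughly what I have in mind, as a userspace C model and not actual kernel
code. All names here (crash_reported, report_crash, watchdog_tick, the
timeout of 5 ticks) are made up for illustration and do not correspond to
the real NMI watchdog implementation:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */

#define WATCHDOG_TIMEOUT_TICKS 5  /* assumed threshold, for illustration only */

static bool crash_reported;            /* set by the first oops or panic */
static unsigned int stalled_ticks;     /* watchdog ticks without progress */
static unsigned int watchdog_reports;  /* how often the watchdog reported */

/* Called from the oops/panic path: print the report and set the flag. */
void report_crash(const char *what)
{
    crash_reported = true;
    printf("%s\n", what);
}

/* Called whenever the CPU makes progress (e.g. from a timer interrupt). */
void note_progress(void)
{
    stalled_ticks = 0;
}

/*
 * Called on every watchdog tick. Returns true if the watchdog printed its
 * own lockup report, false if it stayed quiet.
 */
bool watchdog_tick(void)
{
    if (++stalled_ticks < WATCHDOG_TIMEOUT_TICKS)
        return false;          /* not stalled long enough yet */

    if (crash_reported)
        return false;          /* real oops already on screen, keep it there */

    watchdog_reports++;
    report_crash("NMI watchdog: lockup detected");
    return true;
}
```

With this, a lockup that follows an oops leaves the original oops on the
screen, and a lockup without a preceding oops is still reported exactly once,
because the watchdog report sets the same flag.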

Sometimes the watchdog oops is useful, though, because panics usually don't
give a backtrace.

-Andi
