Re: [CRASH] gdth / __block_prepare_write: zeroing uptodate buffer! / NMI Watchdog detected LOCKUP

From: Andrew Morton (akpm@zip.com.au)
Date: Tue Feb 26 2002 - 14:04:55 EST

Next message: Simon Turvey: "Re: IDE error on 2.4.17"
Previous message: Mike Kravetz: "Re: [Lse-tech] NUMA scheduling"
In reply to: Florian Lohoff: "[CRASH] gdth / __block_prepare_write: zeroing uptodate buffer! / NMI Watchdog detected LOCKUP"
Next in thread: Florian Lohoff: "Re: [CRASH] gdth / __block_prepare_write: zeroing uptodate buffer! / NMI Watchdog detected LOCKUP"
Reply: Florian Lohoff: "Re: [CRASH] gdth / __block_prepare_write: zeroing uptodate buffer! / NMI Watchdog detected LOCKUP"
Reply: Andrea Arcangeli: "Re: [CRASH] gdth / __block_prepare_write: zeroing uptodate buffer! / NMI Watchdog detected LOCKUP"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Florian Lohoff wrote:
>
> Hi,
> i have been looking for deadlocks we are experiencing on a couple of
> SMP machines (Dual Celeron and Dual PIII). After a night stressing a
> spare machine with dbench/bonnie++/tcpspray the machine locked up 30
> minutes after i killed the test. The last messages on the console were:
>
> __block_prepare_write: zeroing uptodate buffer!

Yup. This happens when the disk fills up. Andrea and I were
discussing it over the weekend. There's a new patch in the -aa
kernels which doesn't quite fix it :(

We'll fix it in 2.4.19-pre somehow. It's possible that this problem
causes a chnuk of zeroes to be written into the file when you hit
ENOSPC, which is rather rude. But your file was truncated anyway.

> 15 times - Machine was answering ping first but stopped after a couple
> of minutes. Another couple of minutes later the nmi_watchdog stepped in
> and produced an oops:

SCSI error recovery deadlocked.

Now it's *just* conceivable that the __block_prepare_write() problem
caused a junk request to be sent down to the driver, which caused
the driver to enter recovery, which it then screwed up. But I
doubt it.

It's also conceivable that the NMI watchdog code itself caused
problems also. Back in the days when it was permanently enabled,
some machines kept going silly until nmi watchdog was enabled.

Bottom line: I don't know why you got a SCSI error, and the lockup
is possibly a bug in the scsi layer.

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Simon Turvey: "Re: IDE error on 2.4.17"
Previous message: Mike Kravetz: "Re: [Lse-tech] NUMA scheduling"
In reply to: Florian Lohoff: "[CRASH] gdth / __block_prepare_write: zeroing uptodate buffer! / NMI Watchdog detected LOCKUP"
Next in thread: Florian Lohoff: "Re: [CRASH] gdth / __block_prepare_write: zeroing uptodate buffer! / NMI Watchdog detected LOCKUP"
Reply: Florian Lohoff: "Re: [CRASH] gdth / __block_prepare_write: zeroing uptodate buffer! / NMI Watchdog detected LOCKUP"
Reply: Andrea Arcangeli: "Re: [CRASH] gdth / __block_prepare_write: zeroing uptodate buffer! / NMI Watchdog detected LOCKUP"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2b29 : Thu Feb 28 2002 - 21:00:31 EST