Re: Serious ext2fs problem

David S. Miller (davem@darkside.rutgers.edu)
Tue, 31 Dec 1996 16:29:42 -0500


Date: Tue, 31 Dec 1996 19:08:37 GMT
From: "Stephen C. Tweedie" <sct@dcs.ed.ac.uk>

This is almost certainly bad hardware, either the disk or the SCSI
bus. I'd check your cables and try pushing the bus speed down to 5
MHz. I have seen almost identical symptoms on a news server
myself, and it turned out to be hardware. The reason you see the
free blocks count message is just that this is one place that the
kernel is in a position to do some self-checking; single-bit errors
on the SCSI bus in other blocks won't normally be detected.

This is bogus, Stephen. It is my opinion that any SCSI driver which
does not check parity on all transactions, yet runs disks using
synchronous transfers, is broken and needs to be fixed. When you are
running at 10 MHz and higher, you are playing Russian roulette with
your SCSI chain if you aren't checking parity for all transfers.
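
To make concrete what the parity line buys you, here is a minimal
sketch in C. It is not taken from any driver, and the function names
odd_parity_bit() and parity_ok() are made up for illustration. It
assumes only that the SCSI parallel bus carries one odd-parity bit
per data byte.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compute the odd-parity bit for one data byte.
 * The bit is chosen so that the nine lines (8 data + 1 parity)
 * together carry an odd number of ones. */
static uint8_t odd_parity_bit(uint8_t data)
{
    uint8_t ones = 0;

    for (int i = 0; i < 8; i++)
        ones += (data >> i) & 1u;

    /* Even count of ones in the data: parity bit must be 1. */
    return (ones % 2 == 0) ? 1 : 0;
}

/* Hypothetical receiver-side check: true if the received byte and
 * the received parity bit are consistent with each other. */
static bool parity_ok(uint8_t data, uint8_t parity)
{
    return odd_parity_bit(data) == (parity & 1u);
}
```

A byte that arrives with a single flipped bit fails parity_ok(), so
the transfer can be retried instead of silently landing on disk. Two
flipped bits in the same byte still pass, which is why parity is a
minimum safeguard and not a guarantee.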

I cannot in good conscience encourage someone to place their critical
data on a SCSI disk when the driver is not checking parity. I
therefore made certain that the drivers for all supported SCSI
controllers on the SPARC platform are entirely anal about parity
checking on the SCSI bus in all cases and configurations.

Many of the Intel drivers are not like this: they disable parity
checking, or don't even look at the parity errors reported by the
controller. This is a disaster waiting to happen, and I'd like to
encourage all of the SCSI driver maintainers to fix this problem.

---------------------------------------------////
Yow! 11.26 MB/s remote host TCP bandwidth & ////
199 usec remote TCP latency over 100Mb/s ////
ethernet. Beat that! ////
-----------------------------------------////__________ o
David S. Miller, davem@caip.rutgers.edu /_____________/ / // /_/ ><