Overaggressive failing of disk reads, both LIBATA and IDE

From: Norman Diamond
Date: Thu Mar 19 2009 - 23:21:32 EST


For months I was wondering how a disk could do this:
dd if=/dev/hda of=/dev/null bs=512 skip=551540 count=4 # succeeds
dd if=/dev/hda of=/dev/null bs=512 skip=551544 count=4 # succeeds
dd if=/dev/hda of=/dev/null bs=512 skip=551540 count=8 # fails

It turns out the disk isn't doing that. Linux is. The old IDE drivers did
it, but with LIBATA the same thing happens to /dev/sda. The later examples
behave the same way on /dev/sda as on /dev/hda.

Here's what the disk is really responsible for:
dd if=/dev/hda of=/dev/null bs=512 skip=551562 count=1 # really fails

Here's Linux to blame again:
dd if=/dev/hda of=/dev/null bs=512 skip=551561 count=1 # fails

When the drive reports an uncorrectable media error, Linux correctly records
it in the log. But when the app didn't ask for that block, and every block
the app did ask for was read, Linux incorrectly reports failure to the app.
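
To make the arithmetic concrete, here is a minimal sketch. It assumes the
kernel fetches data from the device in aligned units of 4 sectors, which is
the figure suggested by the P.S. of this message and not something checked
against the kernel source. containing_unit is a hypothetical helper name.

```shell
# containing_unit SECTOR [SECTORS_PER_UNIT]
# Hypothetical helper: print the first and last sector of the aligned
# unit that contains SECTOR. The default unit size of 4 sectors is an
# assumption taken from the behaviour described in this message.
containing_unit() {
    sector=$1
    per_unit=${2:-4}
    first=$(( sector / per_unit * per_unit ))
    last=$(( first + per_unit - 1 ))
    echo "$first $last"
}

containing_unit 551561   # prints "551560 551563"
containing_unit 551562   # prints "551560 551563"
```

Under that assumption, sector 551561 shares a unit with the bad sector
551562, which would be consistent with the one-sector read of 551561
failing even though the drive can read 551561 itself.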

I don't know how Linux decides how many blocks to read ahead, but no matter
how many it chooses, read-ahead is read-ahead. Go ahead and record it in
the log. I'd also like to suggest that if a user is logged in on the screen
(whether X11 or text), we see if we can warn them that their disk is dying.
But don't return a failure to the app. If the blocks that the app asked for
were read, we should give them to the app, successfully.
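
One way to check which failures come from the drive and which come from
read-ahead is to compare a normal buffered read with a direct read that
bypasses the page cache. The sketch below assumes GNU dd, whose iflag=direct
operand opens the input with O_DIRECT. read_sector is a hypothetical
wrapper, and the outcomes on this particular drive are not verified.

```shell
# read_sector DEVICE SECTOR [EXTRA_DD_OPERANDS...]
# Hypothetical wrapper: read exactly one 512-byte sector and report the
# result. Returns 0 if dd succeeded, 1 otherwise.
read_sector() {
    dev=$1
    sector=$2
    shift 2
    if dd if="$dev" of=/dev/null bs=512 skip="$sector" count=1 "$@" 2>/dev/null
    then
        echo "sector $sector: read ok"
    else
        echo "sector $sector: read failed"
        return 1
    fi
}

# Example (needs root and a real device, so left commented out):
#   read_sector /dev/hda 551561                # buffered read through the page cache
#   read_sector /dev/hda 551561 iflag=direct   # bypasses the page cache and read-ahead
```

If the direct read of 551561 succeeds while the buffered read fails, the
failure is coming from neighbouring sectors the app never asked for.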

Sheesh.

P.S.
One would expect this to persuade the hard drive to relocate the block:
dd if=/dev/zero of=/dev/hda bs=512 seek=551562 count=1
But it doesn't because Linux wants to read 4 blocks, modify 1, and write 4
blocks. The read fails.
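
A rough sketch of why the one-sector write needs a read first, assuming
aligned 4-sector units (the figure given in this P.S., not taken from the
kernel source). needs_read_first is a hypothetical helper that reports
whether a write covers some unit only partly.

```shell
# needs_read_first SEEK COUNT [SECTORS_PER_UNIT]
# Hypothetical helper. Succeeds (status 0) if the write covers some unit
# only partly, so the rest of that unit would have to be read in first.
needs_read_first() {
    seek=$1
    count=$2
    per_unit=${3:-4}   # assumed unit size in sectors
    [ $(( seek % per_unit )) -ne 0 ] || [ $(( count % per_unit )) -ne 0 ]
}

if needs_read_first 551562 1; then
    echo "seek=551562 count=1: partial unit, read needed"
fi
if ! needs_read_first 551560 4; then
    echo "seek=551560 count=4: whole unit, no read needed"
fi
```

A write that starts on a unit boundary and covers a whole number of units
replaces everything in them, so nothing has to be read from the disk first.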

One would also expect this to persuade the hard drive to relocate the block:
dd if=/dev/zero of=/dev/hda bs=512 seek=551560 count=4
But it doesn't because the hard drive reports success. If an app tries to
read the bad sector again it still fails. So the drive has egregiously bad
firmware. That doesn't excuse Linux.
