Re: document ext3 requirements

From: Matthias Andree
Date: Tue Jan 06 2009 - 05:36:36 EST


On Sun, 04 Jan 2009, Rob Landley wrote:

> On Sunday 04 January 2009 17:30:52 Theodore Tso wrote:
> > On Sun, Jan 04, 2009 at 11:40:52PM +0100, Pavel Machek wrote:
> > > Not neccessarily.
> > >
> > > If I have a bit of precious data and lot of junk on the card, I want
> > > to copy out the precious data before the card dies. Reading the whole
> > > media may just take too long.
> > >
> > > That's probably very true for rotating harddrives after headcrash...
> >
> > For a small amount data, maybe; but the number of seeks is often far
> > more destructive than the amount of time the disk is spinning. And in
> > practice, what generally happens is the user starts looking around to
> > make sure there wasn't anything else on the disk worth saving, and now
> > data is getting copied off based on human reaction time. So that's
> > why I normally advise users that doing a full image copy of the disk
> > is much better than, say, "cp -r /home/luser /backup", or cd'ing
> > around a filesystem hierarchy and trying to save files one by one.
>
> That would be true if the disk hardware wasn't doing a gazillion retries to
> read a bad sector internally (taking 5 seconds to come back and report
> failure), and then the darn scsi layer added another gazillion retries on top
> of that, and the two multiply together to make it so slow that that when you
> leave the thing copying the disk overnight it's STILL not done 24 hours later.
> Going in and cherry picking individual files looks kind of appealing in that
> situation.

Well, I recently (Dec 1st or so) had a venerable HDD fail with a couple
of bad sectors; with oldish backups (couple of days) (Samsung SP2004C
plugged to a VIA VT6420 in the south bridge, VT8237).

I couldn't use dd_rescue since the disk drive was detached by the OS
upon the disk's hitting the first error.

If it's a software or hardware fault I cannot say, FreeBSD
7.1-PRERELEASE showed the same behaviour as did the openSUSE 11.0 i686
kernels, but then again it might be either OS losing patience with the
drive doing excessive reads, or the drive actually violating the bus
protocols or hanging. It didn't need power cycling though, detaching and
reattaching the ATA bus was sufficient.

For me, recovery was possible with rsync (or cp -Rp): run rsync -avH
until the drive froze, figure which file was affected, note the name,
remount the drive, rm the affected file, and continue.

Is there a "don't retry reads" setting in the kernel that I miss?

(I still have the drive, so I can try some error handling patches if
desired.)

--
Matthias Andree
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/