Re: ext2_free_blocks (fwd)

From: Theodore Y. Ts'o (tytso@MIT.EDU)
Date: Thu Jan 20 2000 - 08:10:08 EST


   Date: Thu, 20 Jan 2000 11:46:11 +0100 (MET)
   From: R.E.Wolff@BitWizard.nl (Rogier Wolff)

>>>[..] Since fsck -f didn't find any problems, it was probably your
>>>hardware hiccuping and returning the wrong block when ext2 tried to read
>>in an indirect block. [..]
>>
>>The corrupted metadata block is been freed before fsck had a chance to run
>>and once the metadata block become an unused block all returned fine.
>>
>How do you know that? It's a plausible scenario, but only if someone
>deleted the file containing the bad indrect block before running fsck.

   The kernel would've spit a bunch of "freeing block outside datazone" in
   that case.

Very true. I usually recommend that people, when they see that error,
to immediately reboot and run fsck, since to not do that risks further
filesystem corruption. I'm going to be submitting patches which changes
these from ext2_warnings to ext2_errors, so that (depending on how your
filesystem's error-behaviour has been set using tune2fs) this will
result in either a panic/reboot, remounting the filesystem read-only, or
ignoring the error. Right now, ignoring it is the only thing that
happens, and people can get their filesystems pretty badly toasted as a
result.

I've seen a number of cases where such errors are reported, but then
when you run fsck, the filesystem is clean. This *usually* means a
hardware problem where the wrong block was transferred by the hardware
(say a bad cable caused one or two bits to be flipped in the block
address). It can mean a severe buffer cache problem or some other
kernel bug, although usually we get a lot more complaints if that is the
case. What I find is that people usually are completely unwilling to
believe that their hardware might be at fault, which is why I usually
mention this first as a possibility. It is important for us as kernel
developers to try to see patterns in the bug reports, and consider the
possibility that it might be our fault, of course.

                                                - Ted

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Jan 23 2000 - 21:00:22 EST