Re: [patch] ext2/3: document conditions when reliable operation is possible

From: Pavel Machek
Date: Tue Aug 25 2009 - 19:45:14 EST



> While I think it is, in principle, worth documenting this sort of
> thing, there are an awful lot of fine details and distinctions that
> would need to be considered.

Ok, can you help? Having a piece of MD documentation explaining the
"powerfail nukes entire stripe" problem and how current filesystems do
not deal with it would be nice, along with a description of when
exactly it happens.

It seems to need two events -- one failed disk and one powerfail. I
knew that raid5 only protects against one failure, but I never
realized that a simple powerfail (or kernel crash) counts as a failure
here, too.

I guess it should go at the end of md.txt.... aha, it actually already
talks about the issue a bit, in:

#Boot time assembly of degraded/dirty arrays
#-------------------------------------------
#
#If a raid5 or raid6 array is both dirty and degraded, it could have
#undetectable data corruption. This is because the fact that it is
#'dirty' means that the parity cannot be trusted, and the fact that it
#is degraded means that some datablocks are missing and cannot reliably
#be reconstructed (due to no parity).
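
To make the "dirty and degraded" case concrete, here is a minimal
sketch in Python. It assumes a three-disk raid5 stripe with plain XOR
parity; the block contents and all names (xor_blocks, d0_old, on_disk,
...) are made up for illustration and say nothing about how md is
actually implemented.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# A consistent stripe: two data blocks plus their parity.
d0_old = b"AAAA"
d1 = b"BBBB"
parity_old = xor_blocks(d0_old, d1)

# With consistent parity, a lost d1 can be rebuilt from the survivors.
d1_clean_rebuild = xor_blocks(d0_old, parity_old)

# Event 1: power fails in the middle of updating d0. The new data block
# reached the disk, the matching parity update did not ("dirty" stripe).
d0_new = b"CCCC"
on_disk = {"d0": d0_new, "d1": d1, "parity": parity_old}

# Event 2: the disk holding d1 fails ("degraded"). Rebuilding d1 from the
# new d0 and the stale parity yields garbage.
d1_rebuilt = xor_blocks(on_disk["d0"], on_disk["parity"])

print(d1_clean_rebuild == d1)  # consistent parity: rebuild matches
print(d1_rebuilt == d1)        # stale parity: rebuild does not match
```

Note that d1 was never being written in this sketch, yet it is the
block that comes back wrong. The filesystem has no way to know that a
block it did not touch was damaged.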

(Actually... that's possibly what happened to a friend of mine. One of
the disks in his raid5 stopped responding and the whole system just
hung. Oops, two failures in one...)
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html