Re: [patch] ext2/3: document conditions when reliable operation is possible

From: Rik van Riel
Date: Wed Aug 26 2009 - 00:08:40 EST


Pavel Machek wrote:

Ok, can you help? Having a piece of MD documentation explaining the
"powerfail nukes entire stripe" and how current filesystems do not
deal with that would be nice, along with description when exactly that
happens.

Except of course for the inconvenient detail that a power
failure on a degraded RAID 5 array does *NOT* nuke the
entire stripe.

A 5-disk RAID 5 array will have 4 data blocks and 1 parity
block in each stripe. A degraded array will have either
4 data blocks or 3 data blocks and 1 parity block in the
stripe.
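To make the two cases concrete, here is a rough sketch that lists what
is left of each stripe once one disk is gone. The rotation rule in
surviving_blocks() is an assumption made for illustration only (parity
moves one disk per stripe); real MD layouts differ in detail, but every
stripe still has exactly one parity block. The function name and the
choice of disk 0 as the failed disk are hypothetical.

```python
def surviving_blocks(n_disks, failed_disk, stripe):
    """Return the kinds of blocks still on disk for one stripe.

    Assumption: parity rotates by one disk per stripe. "D" is a data
    block, "P" is the parity block.
    """
    parity_disk = (n_disks - 1 - stripe) % n_disks
    kinds = []
    for disk in range(n_disks):
        if disk == failed_disk:
            continue  # this disk is gone, so its block is not available
        kinds.append("P" if disk == parity_disk else "D")
    return kinds


# 5-disk array with disk 0 failed: look at one full rotation of parity.
layouts = [surviving_blocks(5, failed_disk=0, stripe=s) for s in range(5)]
for s, kinds in enumerate(layouts):
    print(s, kinds.count("D"), "data,", kinds.count("P"), "parity")
```

Under that assumed rotation, one stripe in five loses its parity block
and keeps all 4 data blocks; the other four keep 3 data blocks plus
parity.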

If we are dealing with a parity-less stripe, we cannot
lose any data due to RAID 5, because each of the 4 data
blocks has a disk block available. We could still lose
a data write due to a power failure, but this could also
happen with the RAID 5 array still intact.

If we are dealing with a 3-data, 1-parity stripe, then
3 of the 4 data blocks have an available disk block and
will not be lost (if they make it to disk). The only
block that depends on all 3 data blocks and the parity
block being correct is the block that does not currently
have a disk to be written to.
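A small simulation of that case, assuming plain XOR parity over 4-byte
blocks. The block contents, the index of the missing disk and the
xor_blocks() helper are made up for the example; this is a sketch of
the argument, not of what the MD code does.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


# Stripe before the crash: 4 data blocks and their parity.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

missing = 3  # assumed: the disk holding data block 3 has failed

# Interrupted stripe write: block 0 reaches its disk, but power fails
# before the matching parity update is written.
on_disk = list(data)
on_disk[0] = b"ZZZZ"
stale_parity = parity

# Blocks that still have a disk are simply read back.
readable = [on_disk[i] for i in range(4) if i != missing]

# The block without a disk has to be rebuilt from everything else.
rebuilt = xor_blocks(readable + [stale_parity])

print(readable)
print(rebuilt == data[missing])
```

The three blocks that have a disk come back exactly as written. Only
the rebuilt block is wrong, because the parity no longer matches the
data it is combined with.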

In short, if a stripe is not written completely on a
degraded RAID 5 array, you can lose:
1) the blocks that were not written (duh)
2) the block that doesn't have a disk

The first part of this loss is also true in a non-degraded
RAID 5 array. The fact that the array is degraded really
does not add much additional data loss here and you certainly
will not lose the entire stripe like you suggest.

--
All rights reversed.