On Sunday August 1, xuan--2004.08.01--linux-kernel--vger.kernel.org@xxxxxxxxxxx wrote:
I disagree. :-)
Hello,
I have searched the documentation and the mailing lists extensively, but have not yet been able to answer this question:
Does Linux software RAID 5 (or RAID 4) do ordered writes? (data stripes first, then parity stripes)
No, it doesn't impose any ordering between writes of parity and data
in the same stripe, and it would not have any material effect on any
outcomes if it did.
Yes, thank you, the clear statement that writes are not ordered helps. :-) It is also a relief to read that data blocks are always preferred over parity blocks, so that data blocks can never become scrambled by a mismatch between other data blocks and parity blocks (at least in non-degraded mode).
Because if the writes are not ordered, parity stripes could be written before data stripes. If the system crashes at this time, reconstruction will try to reconstruct the parity stripes by using the wrong (old) data stripes.
If the writes are ordered, a crash after the write of the data stripe but before the write of the parity stripe does no harm.
When the system crashes, the RAID5 manager assumes that all data blocks
are correct and all parity blocks are suspect. It checks all parity
blocks against corresponding data and corrects those that don't
match.
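
To illustrate the resync rule just described, here is a minimal sketch in Python. It is not the md driver's code; it only assumes that parity is the bytewise XOR of the data blocks in a stripe, and the names xor_blocks and resync_stripe are made up for this example. It models the case where new parity reached the disk but the new data block did not.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def resync_stripe(data_blocks, parity):
    """Trust the data, recompute the parity.

    Returns (parity matching data_blocks, whether the stored parity was wrong).
    Hypothetical helper for illustration only.
    """
    expected = xor_blocks(data_blocks)
    return expected, expected != parity


# Crash scenario: the parity for the new data was written,
# but the first data block still holds its old contents.
old_data = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
new_data = [b"\xaa\xbb", b"\x10\x20", b"\x0f\x0f"]
stale_parity = xor_blocks(new_data)

# Resync treats the on-disk (old) data as correct and fixes the parity.
fixed_parity, corrected = resync_stripe(old_data, stale_parity)
```

After the resync the stripe is consistent again with the old data, which is acceptable because the interrupted write was never reported as complete.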
If a write is "in progress" - i.e. it has started but not all data and
parity have been written - then the "old" data and the "new" data
are equally correct. The only thing that needs to be guaranteed after
a crash, and the only thing that can be guaranteed, is that any data
that has been reported as "safe-in-storage" really is safe. That is
all journalling filesystems, or anything else, assume.
Hope that helps.
NeilBrown
Xuân Baldauf.