Re: [patch] document flash/RAID dangers

From: Ric Wheeler
Date: Tue Aug 25 2009 - 20:13:50 EST


On 08/25/2009 08:06 PM, Pavel Machek wrote:
On Tue 2009-08-25 19:48:09, Ric Wheeler wrote:

---
There are storage devices that have highly undesirable properties
when they are disconnected or suffer power failures while writes are
in progress; such devices include flash devices and MD RAID 4/5/6
arrays. These devices have the property of potentially
corrupting blocks being written at the time of the power failure, and
worse yet, amplifying the region where blocks are corrupted such that
additional sectors are also damaged during the power failure.

I would strike the entire mention of MD devices since it is your
assertion, not a proven fact. You will cause more data loss from common

That actually is a fact. That's how MD RAID 5 is designed. And btw
those are originally Ted's words.


Ted did not design MD RAID5.

events (single sector errors, complete drive failure) by steering people
away from more reliable storage configurations because of a really rare
edge case (power failure during split write to two raid members while
doing a RAID rebuild).

I'm not sure what's rare about power failures. Unlike single sector
errors, my machine actually has a button that produces exactly that
event. Running degraded raid5 arrays for extended periods may be a
slightly unusual configuration, but I suspect people should just do
that for testing. (And from the discussion, people seem to think that
degraded raid5 is equivalent to raid0).

Power failures after a full drive failure with a split write during a rebuild?
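
The case being argued over (one member already failed, then power lost
between the data write and the parity write) can be sketched in a few
lines. This is only an illustration, assuming a hypothetical 3-disk
RAID 5 stripe with plain XOR parity held in memory; it is not MD code,
the names (xor_blocks, d0, d1) are invented for the example, and it says
nothing about how often the event occurs in practice.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


# One stripe of an assumed 3-disk RAID 5: two data blocks plus parity.
d0 = b"AAAA"
d1 = b"BBBB"
parity = xor_blocks(d0, d1)

# Disk 1 fails. The array is degraded, so d1 now exists only
# implicitly, as d0 XOR parity.
assert xor_blocks(d0, parity) == d1

# Updating d0 requires two writes: the new data and the new parity.
new_d0 = b"CCCC"
new_parity = xor_blocks(new_d0, d1)

# If both writes land, d1 is still recoverable.
assert xor_blocks(new_d0, new_parity) == d1

# Split write: power is lost after the data write reaches disk 0
# but before the parity write reaches the parity disk.
disk0_after_crash = new_d0
parity_after_crash = parity  # stale parity

# Reconstructing d1 from the surviving members now yields garbage,
# even though d1 was never the target of the write.
reconstructed_d1 = xor_blocks(disk0_after_crash, parity_after_crash)
d1_corrupted = reconstructed_d1 != d1
```

Note that the damaged block is d1, which was not being written at the
time; that is the "amplifying the region" effect the proposed text
describes. The sketch only shows the window exists in degraded mode, not
how wide it is.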


Otherwise, file systems placed on these devices can suffer silent data
and file system corruption. A forced use of fsck may detect metadata
corruption resulting in file system corruption, but will not suffice
to detect data corruption.


This is very misleading. All storage "can" have silent data loss, you are
making a statement without specifics about frequency.

substitute with "can (by design)"?

By Pavel's unproven casual observation?


Now, can you suggest a useful version of that document meeting your
criteria?

Pavel
