Re: raid is dangerous but that's secret (was Re: [patch] ext2/3: document conditions when reliable operation is possible)

From: Ric Wheeler
Date: Mon Aug 31 2009 - 14:00:30 EST


On 08/31/2009 01:49 PM, Jesse Brandeburg wrote:
> On Sun, Aug 30, 2009 at 8:20 AM, Theodore Tso <tytso@xxxxxxx> wrote:
>> So we *do* have the warning light; the problem is that just as some
>> people may not realize that "check brakes" means, "YOU COULD DIE",
>> some people may not realize that "hard drive failure; RAID array
>> degraded" could mean, "YOU COULD LOSE DATA".
>>
>> Fortunately, for software RAID, this is easily solved; if you are so
>> concerned, why don't you submit a patch to mdadm adjusting the e-mail
>> sent to the system administrator when the array is in a degraded
>> state, such that it states, "YOU COULD LOSE DATA". I would gently
>> suggest to you this would be ***far*** more effective than a patch to
>> kernel documentation.
>
> In the case of a degraded array, could the kernel be more proactive
> (or maybe even mdadm) and have the filesystem remount itself withOUT
> journalling enabled? This seems on the surface to be possible, but I
> don't know the internal particulars that might prevent/allow it.

This is a misconception - with or without journalling, you are open to a second failure during a RAID rebuild.

Also note that by default, ext3 does not mount with barriers turned on.

Even if you mount with barriers, MD RAID5 does not handle barriers, so you stand to lose a lot of data if you have a power outage.
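
To make the barrier point concrete, here is a minimal sketch in Python that scans mount entries in /proc/mounts format and lists ext3 filesystems whose options do not include barrier=1. The function name and the sample lines are hypothetical, and the sketch assumes the kernel reports an enabled barrier as "barrier=1" in the option string. It only shows what was requested at mount time; it cannot tell you whether the underlying MD device actually honours barriers.

```python
def ext3_mounts_without_barriers(mounts_text):
    """Return mount points of ext3 filesystems whose options lack barrier=1.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    unprotected = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not ext3.
        if len(fields) < 4 or fields[2] != "ext3":
            continue
        options = fields[3].split(",")
        # Assumption: an enabled barrier shows up as the literal "barrier=1".
        if "barrier=1" not in options:
            unprotected.append(fields[1])
    return unprotected


# Hypothetical sample input; on a real system, read /proc/mounts instead.
sample = (
    "/dev/md0 /srv ext3 rw,data=ordered 0 0\n"
    "/dev/sda1 /boot ext3 rw,barrier=1,data=ordered 0 0\n"
    "/dev/sdb1 /data xfs rw 0 0\n"
)
print(ext3_mounts_without_barriers(sample))
```

Barriers can be requested for ext3 with the barrier=1 mount option, for example "mount -o remount,barrier=1 /srv", but as noted above that does not help if the layer underneath ignores them.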

Ric
