Re: [patch] document flash/RAID dangers

From: Chris Adams
Date: Wed Aug 26 2009 - 09:40:53 EST

Once upon a time, Theodore Tso <tytso@xxxxxxx> said:
>Well, the software raid layer could be improved so that it implements
>scrubbing by default (i.e., have the md package install a cron job to
>implement a periodic scrub pass automatically).

Fedora 11 added a cron job to kick off a RAID check for each Linux MD
RAID array every week. Combined with running mdmonitor, root will get
an email on any failure.
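
For anyone wondering what such a weekly job boils down to, here is a
rough sketch. This is not the Fedora script. It only assumes the
standard md sysfs attributes sync_action and mismatch_cnt, and the
function names and the sysfs_block parameter are made up for
illustration. It has to run as root to do anything on a real system.

```python
from pathlib import Path


def start_md_checks(sysfs_block="/sys/block"):
    """Request a 'check' pass on every idle MD array; return the names started."""
    started = []
    for action in sorted(Path(sysfs_block).glob("md*/md/sync_action")):
        # Only start a check on an idle array, so that a resync or
        # recovery already in progress is left alone.
        if action.read_text().strip() == "idle":
            action.write_text("check\n")
            started.append(action.parent.parent.name)
    return started


def mismatch_counts(sysfs_block="/sys/block"):
    """Return {array name: mismatch_cnt} as reported by the md driver."""
    return {
        path.parent.parent.name: int(path.read_text())
        for path in Path(sysfs_block).glob("md*/md/mismatch_cnt")
    }
```

Calling start_md_checks() from a weekly cron job and looking at
mismatch_counts() after the pass finishes is the general idea; a
nonzero count on a parity array deserves a closer look.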

The other thing about this thread is that the only RAID implementation
that is being discussed here is the MD RAID stack. There are a lot of
RAID implementations that have the same issues:

- motherboard (aka "fake") RAID - In Linux this is typically mapped with
device mapper via dmraid; AFAIK there is no tool to scrub a Linux DM
RAID setup, or even to monitor its status and notify on failure.

- hardware RAID cards without battery backup - these have the exact same
issues because they cannot guarantee all writes complete, nor can they
keep track of incomplete writes across power failures

- hardware RAID cards _with_ battery backup that don't periodically
test the battery, or that have no way to notify you of battery failure
while Linux is running
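
To make the "incomplete writes" problem concrete, here is a toy model
of one torn stripe update. It assumes a RAID 5 style layout with two
data blocks and one XOR parity block; the block contents and function
names are invented for the example and no real RAID implementation is
being modeled.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def torn_write_rebuild():
    """Rebuild an untouched block after an interrupted update of its stripe."""
    # A consistent stripe: two data blocks plus their parity.
    d0, d1 = b"AAAA", b"BBBB"
    parity = xor_blocks(d0, d1)

    # New data for d0 reaches the disk, but power fails before the
    # matching parity write lands, so parity still describes the old d0.
    d0 = b"CCCC"

    # Later the disk holding d1 dies and d1 is rebuilt from the rest.
    return xor_blocks(d0, parity)


if __name__ == "__main__":
    print(torn_write_rebuild() == b"BBBB")  # prints False
```

Note that d1 was never written to, yet it comes back wrong. Without
something that survives the power failure and records which stripes
were in flight, the array has no way to know the parity is stale.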

The issues being raised here are not specific to extX, MD RAID, or Linux
at all; they are problems with non-"enterprise-class" RAID setups.
There's a reason enterprise-class RAID costs a lot more money than the
card you can pick up at Fry's.

There's no reason for the Linux kernel documentation to cover the
design issues of RAID implementations in general.
Chris Adams <cmadams@xxxxxxxxxx>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.