Re: [patch] ext2/3: document conditions when reliable operation is possible

From: david
Date: Wed Aug 26 2009 - 07:29:05 EST


On Wed, 26 Aug 2009, Pavel Machek wrote:

> On Wed 2009-08-26 06:39:14, Ric Wheeler wrote:
>> On 08/25/2009 10:58 PM, Theodore Tso wrote:
>>> On Tue, Aug 25, 2009 at 09:15:00PM -0400, Ric Wheeler wrote:
>>>>
>>>> I agree with the whole write up outside of the above - degraded RAID
>>>> does meet this requirement unless you have a second (or third, counting
>>>> the split write) failure during the rebuild.
>>>
>>> The argument is that if the degraded RAID array is running in this
>>> state for a long time, and the power fails while the software RAID is
>>> in the middle of writing out a stripe, such that the stripe isn't
>>> completely written out, we could lose all of the data in that stripe.
>>>
>>> In other words, a power failure in the middle of writing out a stripe
>>> in a degraded RAID array counts as a second failure.
>>>
>>> To me, this isn't a particularly interesting or newsworthy point,
>>> since a competent system administrator who cares about his data and/or
>>> his hardware will (a) have a UPS, and (b) be running with a hot spare
>>> and/or will immediately replace a failed drive in a RAID array.
>>
>> I agree that this is not an interesting (or likely) scenario, certainly
>> when compared to the much more frequent failures that RAID will protect
>> against, which is why I object to the document as Pavel suggested. It
>> will steer people away from using RAID and directly increase their
>> chances of losing their data if they use just a single disk.
>
> So instead of fixing or at least documenting a known software deficiency
> in the Linux MD stack, you'll try to suppress that information so that
> people use more raid5 setups?
>
> Perhaps the better documentation will push them to RAID1, or maybe
> make them buy a UPS?

people aren't objecting to better documentation, they are objecting to misleading documentation.

for flash drives the danger is very straightforward (although even then you have to note that it depends heavily on the firmware of the device: some will lose lots of data, some won't lose any)

a good thing to do here would be for someone to devise a test to show this problem, and then gather the results of lots of people performing this test to see what the commonalities are.
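
as a starting point, such a test could look something like the sketch below. this is only an illustration of the idea, not an existing tool: the block size, the record layout, and the names write_pattern and verify are all made up for this example. it writes self-checking blocks (each one carries its own index and a checksum of its payload), syncing after every block; after pulling the power mid-run you would run verify on the same file to see which blocks came back damaged.

```python
import hashlib
import os
import struct

# Assumed layout (hypothetical): 4096-byte blocks, each starting with
# an 8-byte block index and a 32-byte SHA-256 digest of the payload.
BLOCK_SIZE = 4096
HEADER = struct.Struct("<Q32s")


def make_block(index):
    """Build one self-describing block with a deterministic payload."""
    payload_len = BLOCK_SIZE - HEADER.size
    seed = hashlib.sha256(str(index).encode()).digest()
    payload = (seed * (payload_len // len(seed) + 1))[:payload_len]
    digest = hashlib.sha256(payload).digest()
    return HEADER.pack(index, digest) + payload


def write_pattern(path, n_blocks):
    """Write n_blocks blocks, asking the OS to flush after each one."""
    with open(path, "wb") as f:
        for i in range(n_blocks):
            f.write(make_block(i))
            f.flush()
            os.fsync(f.fileno())


def verify(path):
    """Return the indices of blocks that are short, misplaced or corrupt."""
    bad = []
    with open(path, "rb") as f:
        i = 0
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            if len(block) < BLOCK_SIZE:
                bad.append(i)
                break
            index, digest = HEADER.unpack_from(block)
            payload = block[HEADER.size:]
            if index != i or hashlib.sha256(payload).digest() != digest:
                bad.append(i)
            i += 1
    return bad
```

the interesting result is not just the block that was in flight when the power went away, but whether blocks written (and synced) long before it also show up in the list that verify returns.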

you are generalizing that since you have lost data on flash drives, all flash drives are dangerous.

what if it turns out that only one manufacturer is doing things wrong? you will have discouraged people from using flash drives for no reason (potentially causing them to lose data because they are scared away from using flash drives and don't implement anything better).

to be safe, all that a flash drive needs to do is to not change the FTL pointers until the data has fully been recorded in its new location. this is probably a trivial firmware change.
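
to make the ordering concrete, here is a toy model of it. this is not how any real firmware is written; the class ToyFTL, the PowerFailure exception and the fail_during_program flag are invented purely to illustrate the "write the data first, switch the pointer last" rule.

```python
class PowerFailure(Exception):
    """Raised to simulate power being cut in the middle of a write."""


class ToyFTL:
    """Toy flash translation layer: logical block -> physical page."""

    def __init__(self):
        self.pages = {}      # physical page number -> stored bytes
        self.mapping = {}    # logical block address -> physical page
        self.next_page = 0

    def write(self, lba, data, fail_during_program=False):
        # Step 1: program the data into a fresh physical page.
        new_page = self.next_page
        self.next_page += 1
        if fail_during_program:
            # Only part of the page made it to flash. The pointer for
            # this logical block has not been touched yet.
            self.pages[new_page] = data[: len(data) // 2]
            raise PowerFailure()
        self.pages[new_page] = data

        # Step 2: only now switch the pointer to the new page.
        old_page = self.mapping.get(lba)
        self.mapping[lba] = new_page

        # Step 3: the old page can be reclaimed afterwards.
        if old_page is not None:
            del self.pages[old_page]

    def read(self, lba):
        return self.pages[self.mapping[lba]]
```

with this ordering an interrupted write leaves a half-written orphan page behind, but a read of the logical block still returns the old, complete contents. a real device would also have to make the pointer update itself atomic, which this model simply assumes.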


for raid arrays, we are still learning the nuances of what actually can happen. a few hours ago Ric pointed out that with raid 5 you won't trash the entire stripe (which is what I thought happened from prior comments), but instead run the risk of losing two relatively definable chunks of data:

1. the block you are writing (which you can lose anyway)

2. the block that would live on the disk that is missing.

that drastically lessens the impact of the problem.
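
the two cases above can be shown with plain XOR parity. this is a simplified sketch, not the md code: it assumes a single stripe with three data chunks and one parity chunk, the disk holding the third chunk is the missing one, and the power fails after the new data chunk is on disk but before the parity is rewritten. the function names are made up for this example.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte strings together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)


def torn_write_on_degraded_raid5():
    # One stripe: three data chunks plus XOR parity.
    d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
    parity = xor_blocks(d0, d1, d2)

    # The disk holding d2 is missing, so d2 only exists as
    # "everything else XORed together". Before the crash that works.
    assert xor_blocks(d0, d1, parity) == d2

    # Torn write: the new d0 reaches its disk, then the power fails
    # before the matching parity update is written.
    d0_on_disk = b"ZZZZ"

    return {
        "d0": d0_on_disk,                                  # block being written
        "d1": d1,                                          # untouched block
        "d2_original": d2,
        "d2_rebuilt": xor_blocks(d0_on_disk, d1, parity),  # missing disk
    }
```

in this model d1 reads back exactly as before, d0 is whatever made it to disk, and the rebuilt d2 no longer matches the original, even though nobody wrote to it. the rest of the stripe is not trashed.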

I would like to see someone explain what would happen on raid 6. Neil talked about the possibility of trying the various combinations and seeing which ones agree with each other; I think that would be a good thing to implement if he can do so.

but the super simplified statement you keep trying to make is significantly overstating and oversimplifying the problem.

David Lang