Re: [patch] ext2/3: document conditions when reliable operation ispossible

From: Pavel Machek
Date: Mon Aug 24 2009 - 17:25:37 EST


>> I can reproduce data loss with ext3 on flashcard in about 40
>> seconds. I'd not call that "odd event". It would be nice to handle
>> that, but that is hard. So ... can we at least get that documented
>> please?
> Part of documenting best practices is to put down very specific things
> that do/don't work. What I worry about is producing too much detail to
> be of use for real end users.

Well, I was trying to write for kernel audience. Someone can turn that
into nice end-user manual.

> I have to admit that I have not paid enough attention to this specifics
> of your ext3 + flash card issue - is it the ftl stuff doing out of order
> IO's?

The problem is that flash cards destroy whole erase block on unplug,
and ext3 can't cope with that.

>> _All_ flash cards (MMC, USB, SD) had the problems. You don't need to
>> get clear grasp on trends. Those cards just don't meet ext3
>> expectations, and if you pull them, you get data loss.
> Pull them even after an unmount, or pull them hot?

Pull them hot.

[Some people try -osync to avoid data loss on flash cards... that will
not do the trick. Flashcard will still kill the eraseblock.]

>>> Nothing is perfect. It is still a trade off between storage
>>> utilization (how much storage we give users for say 5 2TB drives),
>>> performance and costs (throw away any disks over 2 years old?).
>> "Nothing is perfect"?! That's design decision/problem in raid5/ext3. I
>> believe that should be at least documented. (And understand why ZFS is
>> interesting thing).
> Your statement is overly broad - ext3 on a commercial RAID array that
> does RAID5 or RAID6, etc has no issues that I know of.

If your commercial RAID array is battery backed, maybe. But I was
talking Linux MD here.

>> And I still use my zaurus with crappy DRAM.
>> I would not trust raid5 array with my data, for multiple
>> reasons. The fact that degraded raid5 breaks ext3 assumptions should
>> really be documented.
> Again, you say RAID5 without enough specifics. Are you pointing just at
> MD RAID5 on S-ATA? Hardware RAID cards? A specific commercial RAID5
> vendor?

Degraded MD RAID5 on anything, including SATA, and including
hypothetical "perfect disk".

>> The papers show failures in "once a year" range. I have "twice a
>> minute" failure scenario with flashdisks.
>> Not sure how often "degraded raid5 breaks ext3 atomicity" would bite,
>> but I bet it would be on "once a day" scale.
>> We should document those.
> Documentation is fine with sufficient, hard data....

Degraded MD RAID5 does not work by design; whole stripe will be
damaged on powerfail or reset or kernel bug, and ext3 can not cope
with that kind of damage. [I don't see why statistics should be
neccessary for that; the same way we don't need statistics to see that
ext2 needs fsck after powerfail.]
(cesky, pictures)
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at