Re: [patch] document flash/RAID dangers

From: Pavel Machek
Date: Wed Aug 26 2009 - 07:25:53 EST

On Tue 2009-08-25 18:19:40, david@xxxxxxx wrote:
> On Wed, 26 Aug 2009, Pavel Machek wrote:
>>>>>> THESE devices have the property of potentially corrupting blocks being
>>>>>> written at the time of the power failure,
>>>>> this is true of all devices
>>>> Actually I don't think so. I believe SATA disks do not corrupt even
>>>> the sector they are writing to -- they just have big enough
>>>> capacitors. And yes I believe ext3 depends on that.
>>> Pavel, no S-ATA drive has capacitors to hold up during a power failure
>>> (or even enough power to destage their write cache). I know this from
>>> direct, personal knowledge having built RAID boxes at EMC for years. In
>>> fact, almost all RAID boxes require that the write cache be hardwired to
>>> off when used in their arrays.
>> I never claimed they have enough power to flush entire cache -- read
>> the paragraph again. I do believe the disks have enough capacitors to
>> finish writing single sector, and I do believe ext3 depends on that.
> keep in mind that in a powerfail situation the data being sent to the
> drive may be corrupt (the ram gets flaky while a DMA to the drive copies
> the bad data to the drive, which writes it before the power loss gets bad
> enough for the drive to decide there is a problem and shutdown)
> you just plain cannot count on writes that are in flight when a powerfail
> happens to do predictable things, let alone what you consider sane or
> proper.

From what I see, this kind of failure is rather harder to reproduce
than the software problems. And at least SGI machines were designed to
avoid this...

Anyway, I'd like to hear from ext3 people... what happens on read
errors in the journal? That's what you'd expect to see in the situation above.
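
To make the question concrete, here is a toy sketch of how a journal replay could notice a block that was damaged in flight. It is only an illustration under stated assumptions: the record layout (a 4-byte CRC32 in front of the payload) and the names make_record and replay are hypothetical, and nothing here describes what ext3 actually does on replay.

```python
import zlib


def make_record(payload: bytes) -> bytes:
    """Build a journal record: 4-byte CRC32 followed by the payload.

    Hypothetical on-disk format, chosen only for this illustration.
    """
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def replay(records):
    """Return the payloads that are safe to replay.

    Replay stops at the first record whose stored checksum does not
    match its payload, since later records cannot be trusted either.
    """
    good = []
    for rec in records:
        stored = int.from_bytes(rec[:4], "big")
        payload = rec[4:]
        if zlib.crc32(payload) != stored:
            break  # damaged record: stop instead of replaying bad data
        good.append(payload)
    return good


journal = [make_record(b"block A"), make_record(b"block B")]

# Simulate a write damaged during power failure: flip one bit of the
# last record's payload.
damaged = bytearray(journal[1])
damaged[-1] ^= 0x01
journal[1] = bytes(damaged)

replayed = replay(journal)
```

With the damaged record in place, replay returns only the first payload. A checksum like this catches the corruption but does not recover the lost block.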