Re: [patch] document flash/RAID dangers

From: Ric Wheeler
Date: Tue Aug 25 2009 - 20:51:11 EST


On 08/25/2009 08:44 PM, Pavel Machek wrote:

THESE devices have the property of potentially corrupting blocks being
written at the time of the power failure,

this is true of all devices

Actually I don't think so. I believe SATA disks do not corrupt even
the sector they are writing to -- they just have big enough
capacitors. And yes I believe ext3 depends on that.

Pavel, no S-ATA drive has capacitors to hold up during a power failure
(or even enough power to destage their write cache). I know this from
direct, personal knowledge having built RAID boxes at EMC for years. In
fact, almost all RAID boxes require that the write cache be hardwired to
off when used in their arrays.

I never claimed they have enough power to flush entire cache -- read
the paragraph again. I do believe the disks have enough capacitors to
finish writing single sector, and I do believe ext3 depends on that.

Pavel

Some scary terms that drive people mention (and measure):

"high fly writes"
"over powered seeks"
"adjacent track erasure"

If you do get a partial track written, the data integrity bits that the data is embedded in will flag it as invalid and give you an IO error on the next read. Note that the damage is not persistent; it will get repaired (in place) on the next write to that sector.
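To make the "flagged on read, repaired on the next write" behaviour concrete, here is a minimal sketch. It is only a toy model: the 512-byte sector size, the use of CRC-32 as a stand-in for the drive's integrity bits, and all function names are assumptions for illustration, not how any particular drive firmware works.

```python
import zlib

SECTOR_SIZE = 512  # assumed sector size in bytes


def check_value(payload):
    """Stand-in for the per-sector integrity bits (assumption: CRC-32)."""
    return zlib.crc32(payload)


def full_write(new):
    """A completed write stores the payload with a matching check value."""
    return new, check_value(new)


def torn_write(old, old_check, new, bytes_written):
    """Power fails mid-sector: only a prefix of the new payload lands,
    and the stored check value is never updated."""
    return new[:bytes_written] + old[bytes_written:], old_check


def is_readable(payload, check):
    """A read succeeds only if the stored check value matches the payload."""
    return check_value(payload) == check


old = bytes(SECTOR_SIZE)        # sector previously held zeros
new = b"\xff" * SECTOR_SIZE     # the write that gets interrupted

# Torn sector: the mismatch is what surfaces as an IO error on read.
payload, check = torn_write(old, check_value(old), new, 100)
print(is_readable(payload, check))

# The next complete write to the same sector repairs it in place.
payload, check = full_write(new)
print(is_readable(payload, check))
```

The point of the model is that the bad state is detectable and is cleared by rewriting the sector, not that the old or new contents survive the interrupted write.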

Also, it is worth noting that ext2/3/4 write file system "blocks", not single sectors. Each ext3 IO is 8 distinct disk sector writes, and those can span tracks on a drive, which requires a seek, all of which consumes power.
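The arithmetic behind "8 distinct disk sector writes" can be sketched as follows, assuming a 4096-byte file system block and 512-byte sectors. The track layout is a deliberate simplification (a fixed, hypothetical number of sectors per track with a linear mapping); real drives use zoned recording and their geometry is not visible to the host.

```python
SECTOR_SIZE = 512   # assumed sector size in bytes
BLOCK_SIZE = 4096   # assumed ext3 block size in bytes


def sectors_per_block(block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """Number of sector writes needed for one file system block."""
    return block_size // sector_size


def block_sectors(block_nr):
    """Sector numbers covered by a given file system block."""
    n = sectors_per_block()
    first = block_nr * n
    return range(first, first + n)


def spans_tracks(block_nr, sectors_per_track):
    """True if the block's first and last sectors fall on different tracks
    under the simplified fixed-geometry model."""
    sectors = block_sectors(block_nr)
    return sectors[0] // sectors_per_track != sectors[-1] // sectors_per_track


print(sectors_per_block())
# With a hypothetical 500 sectors per track, block 62 covers sectors
# 496..503 and therefore crosses from track 0 onto track 1.
print(spans_tracks(62, 500))
print(spans_tracks(0, 500))
```

A block that crosses a track boundary cannot be completed without moving the heads, which is why "enough power to finish one sector" is not the same as "enough power to finish one ext3 block".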

On power loss, a disk will immediately park the heads...

ric
