Re: ext4 features (checksums)

From: Avi Kivity
Date: Tue Jul 04 2006 - 04:24:47 EST


Neil Brown wrote:

On Tuesday July 4, avi@xxxxxxxxxx wrote:
> Neil Brown wrote:
> >
> > To my mind, the only thing you should put between the filesystem and
> > the raw devices is RAID (real-raid - not raid0 or linear).
> >
> I believe that implementing RAID in the filesystem has many benefits too:
> - multiple RAID levels: store metadata in triple-mirror RAID 1, random
> write intensive data in RAID 1, bulk data in RAID 5/6
> - improved write throughput - since stripes can be variable size, any
> large enough write fills a whole stripe

Maybe....

Now imagine what would be required to rebuild a whole drive onto a
spare after a drive failure.

I'm sure it is possible, and I believe ZFS does something like that.
I find it hard to imagine getting reasonable speed if there is much
complexity. And the longer it takes, the longer your data is exposed
to multiple-failures.


A company called Isilon does this on a cluster. They claim (IIRC) a one hour rebuild time for a failure. AFAIK they rebuild into cluster free space, so they are not bound by the spare's bandwidth; they can utilize all cluster resources for a rebuild.

(You don't need spare disks, just spare free space; so you don't have idle disk heads)

In terms of complexity, I imagine one needs a reverse mapping (extent -> (inode, offset)); given that, one can very easily rebuild failed disks, and more features are easy to implement, like evacuation of a drive, or rebalancing data across all drives when new disks are added.

The same ideas can be applied to a non-clustered filesystem, of course.

--
error compiling committee.c: too many arguments to function

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/