Re: raid0 slower than devices it is assembled of?

From: jw schultz
Date: Wed Dec 17 2003 - 21:20:38 EST

Next message: David Brownell: "Re: Handling of bounce buffers by rh_call_control"
Previous message: Linda Xie: "Re: PATCH -- add unlimited name lengths support to sysfs"
In reply to: bill davidsen: "Re: raid0 slower than devices it is assembled of?"
Next in thread: Andre Hedrick: "Re: raid0 slower than devices it is assembled of?"

On Wed, Dec 17, 2003 at 10:29:18PM +0000, bill davidsen wrote:
> In article <Pine.LNX.4.58.0312160825570.1599@xxxxxxxxxxxxx>,
> Linus Torvalds <torvalds@xxxxxxxx> wrote:
> |
> |
> | On Tue, 16 Dec 2003, Helge Hafting wrote:
> | >
> | > Raid-0 is ideally N times faster than a single disk, when
> | > you have N disks.
> |
> | Well, that's a _really_ "ideal" world. Ideal to the point of being
> | unrealistic.
> |
> | In most real-world situations, latency is at least as important as
> | throughput, and often dominates the story. At which point RAID-0 doesn't
> | improve performance one iota (it might make the seeks shorter, but since
> | seek latency tends to be dominated by things like rotational delay and
> | settle times, that's unlikely to be a really noticeable issue).
>
> Don't forget time in o/s queues, once an array gets loaded that may
> dominate the mechanical latency and transfer times. If you call "access
> time" the sum of all latency between syscall and the first data
> transfer, then reading from multiple drives doesn't reliably help until
> you get the transfer time from an i/o somewhere between 2 and 4x the
> access time. So if the transfer time for a typical i/o is less than 2x
> the typical access time, gains are unlikely. If you set the stripe size
> high enough to make it likely that a typical i/o falls on a single drive
> you usually win. And when the transfer time reaches 4x the access time,
> you almost always win with a split.
>
> So if you are copying 100MB elements you probably win by spreading the
> i/o, but for more normal things it doesn't much matter.
>
> THERE'S ONE EXCEPTION: if you have a f/s type which puts the inodes at
> the beginning of the space, and you are creating and deleting a LOT of
> files, with a large stripe you will beat the snot out of one drive and
> the system will bottleneck no end. In that one case you gain by using
> small stripe size and spreading the head motion, even though the file
> i/o itself may really rot. Makes me wish for a f/s which could put the
> inodes in some distributed pattern.
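
The rule of thumb quoted above can be written down as a rough sketch. The 2x and 4x thresholds are taken from the quoted text; the function name, the 10 ms access time and the 50 MB/s transfer rate are made-up illustrative figures, not measurements.

```python
def striping_outlook(transfer_time_ms, access_time_ms):
    """Classify an i/o by its transfer/access time ratio (quoted rule of thumb)."""
    ratio = transfer_time_ms / access_time_ms
    if ratio < 2.0:
        return "gains unlikely"        # transfer time < 2x access time
    if ratio < 4.0:
        return "may help"              # somewhere between 2x and 4x
    return "almost always a win"       # transfer time >= 4x access time


# Hypothetical drive: 10 ms access time, 50 MB/s sustained transfer.
ACCESS_MS = 10.0
MB_PER_MS = 50.0 / 1000.0

for size_mb in (0.0625, 1.0, 100.0):   # 64 KiB, 1 MB, 100 MB
    transfer_ms = size_mb / MB_PER_MS
    print(size_mb, "MB:", striping_outlook(transfer_ms, ACCESS_MS))
```

With those assumed figures only the 100MB copy lands clearly on the winning side, which matches the "for more normal things it doesn't much matter" remark.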

If I recall correctly, ext2, like UFS, splits the inode table
up and puts parts of it at the beginning of each cylinder or
block group. Inode assignment is based on an allocation
rule that spreads them across the disks so that the file data and
inode will be near each other. ext[23] also has this mke2fs
option:

       -R raid-options
              Set raid-related options for the filesystem. Raid
              options are comma separated, and may take an
              argument using the equals ('=') sign. The following
              options are supported:

                   stride=stripe-size
                          Configure the filesystem for a RAID
                          array with stripe-size filesystem
                          blocks per stripe.

The purpose of the stride option is to keep the inode table
pieces from all winding up on the same disks, as would
happen if the stripe size aligns with the block group size.
With stride set they are staggered instead.
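
To make the alignment problem concrete, here is a small sketch. It is only a model: the 64 KiB chunk, 4 KiB block, two-disk array and 32768 blocks per group are assumed example values, the helper names are made up, and the "stagger by one stride per group" rule is a simplification for illustration, not necessarily the exact placement mke2fs uses.

```python
def stride_blocks(chunk_bytes, block_bytes):
    """Stripe (chunk) size expressed in filesystem blocks, as stride= expects."""
    if chunk_bytes % block_bytes:
        raise ValueError("chunk size must be a multiple of the block size")
    return chunk_bytes // block_bytes


def disk_of_block(block, stride, ndisks):
    """RAID-0 member disk that holds a given filesystem block."""
    return (block // stride) % ndisks


def group_start_disks(ngroups, blocks_per_group, stride, ndisks, stagger=False):
    """Disk on which each block group's leading metadata would land."""
    disks = []
    for g in range(ngroups):
        start = g * blocks_per_group
        if stagger:
            start += g * stride  # assumed offset: one stride further per group
        disks.append(disk_of_block(start, stride, ndisks))
    return disks


stride = stride_blocks(64 * 1024, 4096)                       # 16 blocks
print(group_start_disks(4, 32768, stride, 2))                 # [0, 0, 0, 0]
print(group_start_disks(4, 32768, stride, 2, stagger=True))   # [0, 1, 0, 1]
```

In this model every block group starts on disk 0 without staggering, because 32768 blocks is a whole number of full stripes, and the groups alternate between the two disks once the offset is applied. For these example values the option would be -R stride=16.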

My recollection is that one or both of XFS and JFS store
the inode table in extents which are allocated on demand, so
I would hope they also make inode and file data locality a
priority.

--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: jw@xxxxxxxxxx

Remember Cernan and Schmitt
