Re: getdents - ext4 vs btrfs performance

From: Chris Mason
Date: Fri Mar 09 2012 - 09:34:51 EST


On Fri, Mar 09, 2012 at 12:29:29PM +0100, Lukas Czerner wrote:
> Hi,
>
> I have created a simple script which creates a bunch of files with
> random names in a directory and then performs operations like list,
> tar, find, copy and remove. I have run it on ext4, xfs and btrfs with
> 4k files. The result is that ext4 pretty much dominates the
> create times, tar times and find times. However, copy time is a whole
> different story, unfortunately: it sucks badly.
>
> Once we cross the mark of 320000 files in the directory (on my system),
> ext4 becomes significantly worse in copy times. And that is where
> the hash tree order of the directory entries really kicks in.
>
> Here is a simple graph:
>
> http://people.redhat.com/lczerner/files/copy_benchmark.pdf
>
> Here is the data, where you can play with it:
>
> https://www.google.com/fusiontables/DataSource?snapid=S425803zyTE
>
> and here is the txt file for convenience:
>
> http://people.redhat.com/lczerner/files/copy_data.txt
>
> I have also run correlation.py from Phillip Susi on a directory with
> 100000 4k files, and indeed the name to block correlation in ext4 is
> pretty much random :)
>
> _ext4_
> Name to inode correlation: 0.50002499975
> Name to block correlation: 0.50002499975
> Inode to block correlation: 0.9999900001
>
> _xfs_
> Name to inode correlation: 0.969660303397
> Name to block correlation: 0.969660303397
> Inode to block correlation: 1.0
>
>
> So there is definitely a lot of room for improvement in ext4.
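
For readers who want to try a similar measurement, here is a minimal
sketch of one possible ordering metric. It is NOT Phillip Susi's
correlation.py, whose exact method is not shown in this thread. The
metric (the fraction of adjacent pairs that are in ascending order) and
the function names are assumptions made for illustration. It covers only
name-to-inode ordering; block numbers would need a FIBMAP/FIEMAP lookup,
which is left out.

```python
import os


def ascending_fraction(values):
    """Fraction of adjacent pairs (a, b) in `values` with a <= b.

    Hypothetical metric chosen for illustration: 1.0 means fully
    ascending, 0.0 fully descending, and a random order gives
    about 0.5 on average.
    """
    pairs = list(zip(values, values[1:]))
    if not pairs:
        return 1.0
    return sum(1 for a, b in pairs if a <= b) / len(pairs)


def name_to_inode_fraction(path):
    """Sort the entries of `path` by name, then score how well the
    inode numbers follow that same order."""
    with os.scandir(path) as it:
        by_name = sorted((entry.name, entry.inode()) for entry in it)
    return ascending_fraction([inode for _name, inode in by_name])
```

Calling name_to_inode_fraction() on a test directory returns a value
between 0 and 1; a result close to 0.5 would mean that name order says
almost nothing about inode order.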

Thanks Lukas, this is great data. There is definitely room for btrfs to
speed up in the other phases as well.

-chris
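
The hash order problem described in the quoted message is commonly
worked around in userspace by sorting directory entries by inode number
before touching the files. Since the ext4 numbers above show inode order
tracking block order almost perfectly, this should bring the access
pattern close to on-disk order. What follows is a minimal sketch under
that assumption, not something proposed in this thread; the function
names are hypothetical and it handles regular files in a single
directory only.

```python
import os
import shutil


def entries_in_inode_order(src):
    """Return (inode, name) pairs for the regular files in `src`,
    sorted by inode number instead of readdir (hash) order."""
    with os.scandir(src) as it:
        found = [(entry.inode(), entry.name)
                 for entry in it
                 if entry.is_file(follow_symlinks=False)]
    found.sort()
    return found


def copy_dir_by_inode(src, dst):
    """Copy the regular files from `src` into `dst`, visiting them in
    inode order. Returns the names in the order they were copied."""
    os.makedirs(dst, exist_ok=True)
    copied = []
    for _inode, name in entries_in_inode_order(src):
        shutil.copy2(os.path.join(src, name), os.path.join(dst, name))
        copied.append(name)
    return copied
```

The whole listing is held in memory before the first file is read, which
is the price of sorting; for a directory with a few hundred thousand
entries that is a list of small tuples and should be cheap next to the
seeks it avoids.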