Re: stat benchmark

From: Theodore Tso
Date: Sun Apr 27 2008 - 22:11:19 EST


On Mon, Apr 28, 2008 at 01:29:52AM +0200, Soeren Sandmann wrote:
> Sorting by inode is a major improvement. The numbers are less stable,
> but consistently much lower:
>
> Time to readdir(): 0.238737 s
> Time to stat 2366 files: 1.338904 s
>
> compared to
>
> Time to readdir(): 0.227599 s
> Time to stat 2366 files: 7.981752 s
>
> Of course, 1.3 seconds is still far from instant, but it may be the
> best we can get given the realities of ext3 disk layout.

Out of curiosity, what was the directory that you were stating? If it
took you 1.3 seconds to stat 2366 files, the directory must have inodes
scattered all over the disk, or the disk must be very slow. On my laptop
disk, I can stat 9543 files in 1.1 seconds (from a Maildir directory).

Also, why does the application need to stat all of the files? Is it
just to get the file type? (i.e., regular file vs. directory) If so,
maybe you can use the d_type field in the directory entry returned by
readdir().
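
As a rough sketch of the d_type idea (the function names below are
made up for illustration, and it assumes a Linux/glibc system where
struct dirent has a d_type field): use d_type when the filesystem
fills it in, and fall back to a stat call only when it reports
DT_UNKNOWN, which some filesystems always do.

```c
#define _DEFAULT_SOURCE
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: 1 if the entry is a directory, 0 if not, -1 on error. */
static int entry_is_dir(int dfd, const struct dirent *de)
{
    struct stat st;

    /* Fast path: the filesystem told us the type, no stat needed. */
    if (de->d_type != DT_UNKNOWN)
        return de->d_type == DT_DIR;

    /* Slow path: d_type is not supported here, so ask the inode. */
    if (fstatat(dfd, de->d_name, &st, AT_SYMLINK_NOFOLLOW) < 0)
        return -1;
    return S_ISDIR(st.st_mode);
}

/* Hypothetical example: count subdirectories of path, skipping "." and "..".
 * Returns -1 if the directory cannot be opened. */
int count_subdirs(const char *path)
{
    DIR *d = opendir(path);
    struct dirent *de;
    int n = 0;

    if (!d)
        return -1;
    while ((de = readdir(d)) != NULL) {
        if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
            continue;
        if (entry_is_dir(dirfd(d), de) == 1)
            n++;
    }
    closedir(d);
    return n;
}
```

When d_type is populated, this never touches the inode table at all,
which is the whole point. Note that a symlink shows up as DT_LNK, so
code that wants to follow links still has to stat those entries.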

> I don't know if a general library outside glib would be useful. It
> seems that just telling people to "sort by inode before statting"
> would be just as effective as telling them "use this optimized
> library".

Well, the question is what we would need to do in order to make it
really easy for people to drop that into their code. Programmers are
fundamentally lazy, after all, and if it's too much work to create an
interim data structure, and then qsort it, they won't. But maybe the
glib interface is that convenient interface, and all we need to do is
change glib to sort with a much larger chunk size.
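
For reference, the "interim data structure, and then qsort it" step
could look something like the sketch below. This is not the glib code;
struct ent, by_ino() and stat_dir_sorted() are names invented for the
example, and it reads the whole directory in one go rather than in
chunks.

```c
#define _DEFAULT_SOURCE
#include <dirent.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Interim record: just enough to sort by inode and stat by name later. */
struct ent {
    ino_t ino;
    char *name;
};

static int by_ino(const void *a, const void *b)
{
    ino_t x = ((const struct ent *)a)->ino;
    ino_t y = ((const struct ent *)b)->ino;

    return (x > y) - (x < y);
}

/* Stat every entry of path (including "." and "..") in inode order.
 * Returns the number of successful stats, or -1 on error. */
long stat_dir_sorted(const char *path)
{
    DIR *d = opendir(path);
    struct ent *v = NULL;
    size_t n = 0, cap = 0, i;
    struct dirent *de;
    struct stat st;
    long ok = 0;

    if (!d)
        return -1;

    /* Pass 1: collect (inode, name) pairs from readdir(). */
    while ((de = readdir(d)) != NULL) {
        if (n == cap) {
            size_t ncap = cap ? cap * 2 : 256;
            struct ent *nv = realloc(v, ncap * sizeof *v);

            if (!nv) { ok = -1; goto out; }
            v = nv;
            cap = ncap;
        }
        v[n].name = strdup(de->d_name);
        if (!v[n].name) { ok = -1; goto out; }
        v[n].ino = de->d_ino;
        n++;
    }

    /* Pass 2: sort by inode number, then stat in that order. */
    if (n > 1)
        qsort(v, n, sizeof *v, by_ino);
    for (i = 0; i < n; i++)
        if (fstatat(dirfd(d), v[i].name, &st, AT_SYMLINK_NOFOLLOW) == 0)
            ok++;
out:
    for (i = 0; i < n; i++)
        free(v[i].name);
    free(v);
    closedir(d);
    return ok;
}
```

A real caller would do something with each struct stat instead of just
counting successes. The cost is holding every name in memory at once,
which is the tradeoff against sorting in fixed-size chunks.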

We do need to get similar changes into find, ls, and many other
programs that might not be so interested in linking against glib,
though.

- Ted