Re: Filesystem optimization..

Eric W. Biederman (ebiederm+eric@npwt.net)
29 Dec 1997 06:05:04 -0600


>>>>> "MR" == Michael O'Reilly <michael@metal.iinet.net.au> writes:

MR> alan@lxorguk.ukuu.org.uk (Alan Cox) writes:
>> > The change should dramatically improve path lookups (the inode for the
>> > next directory segment is inline with the name, so saves you a seek
>> > per segment), random open()'s (saves you a seek per open), find(1)'s
>> > all over, etc etc.
>>
>> If you have plenty of memory the odds of the directory not being in
>> memory when you fetch the inodes are pretty low IMHO, especially as the
>> directory will be readahead.

MR> Even in this, there's still a win from not needing to allocate a fixed
MR> amount of inodes.

And again, see btree-based filesystems. There is reiserfs in the
works, as well as my own shmfs filesystem (though because it has
different priorities, it doesn't yet keep all inodes in the btree), but
basically with such a beast it is possible to keep inodes in the
directory tree.
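
To make the seek-per-segment argument concrete, here is a toy model of a
cold-cache path lookup. It is only a sketch: the Inode class, the lookup()
function and the one-read-per-block cost are hypothetical, and it does not
describe how reiserfs, shmfs or ext2 actually lay things out.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    """Hypothetical in-memory stand-in for an on-disk inode."""
    is_dir: bool
    entries: dict = field(default_factory=dict)  # name -> child Inode


def lookup(root, path, inline_inodes):
    """Resolve `path` from `root`; return (inode, simulated block reads).

    Assumes nothing is cached: every directory block costs one read, and
    when inodes live in a separate table each segment costs one more.
    """
    reads = 0
    node = root
    for name in (seg for seg in path.split("/") if seg):
        reads += 1                 # read the directory block holding `name`
        node = node.entries[name]
        if not inline_inodes:
            reads += 1             # extra read of the separate inode table
    return node, reads


# Build /usr/lib/libc.so and compare the two layouts.
root, usr, lib, libc = Inode(True), Inode(True), Inode(True), Inode(False)
root.entries["usr"] = usr
usr.entries["lib"] = lib
lib.entries["libc.so"] = libc

print(lookup(root, "/usr/lib/libc.so", inline_inodes=False)[1])  # 6
print(lookup(root, "/usr/lib/libc.so", inline_inodes=True)[1])   # 3
```

With a warm cache the difference shrinks or vanishes, which is Alan's
point above.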

MR> In practice, on a large server, it's rare to get a very high level of
MR> cache hits (a 3 million file filesystem would need 384M of RAM just to
MR> hold the inode tables in the best case, ignoring all the directories,
MR> the other meta-data, and the on-going disk activity).

Perhaps the directory cache is too small for your machine?
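
For scale, the inode-table estimate can be checked with a little
arithmetic. This is a rough sketch that assumes 128-byte on-disk inodes
(the classic ext2 size; nobody in this thread states it), and the names
INODE_SIZE and inode_table_bytes() are made up for the example.

```python
INODE_SIZE = 128  # bytes per on-disk inode; an assumption, not from the thread


def inode_table_bytes(n_files, inode_size=INODE_SIZE):
    """Best-case memory needed to hold one inode per file, nothing else."""
    return n_files * inode_size


total = inode_table_bytes(3_000_000)
print(f"{total / 10**6:.0f} MB")  # 384 MB
```

That covers the inodes alone, with no directories or other meta-data.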

>> > Comments?? (people dying to implement such a beast? :)
>>
>> Are you sure it's not in fact the time taken to walk directories that is
>> doing the damage, especially if they are big directories? If so then a
>> btree directory format makes more sense.

MR> My example case has less than 100 entries per directory. (LOTS of
MR> directories tho).

Sounds like a case of a too-small directory cache. ext2 has some
fairly slow directory routines, which I notice whenever I do an ls in
the usr/X11R6/man/man3 directory, where all of the filenames are too
large for the cache. It takes forever, in part because I run zlibc,
which stats them all, etc.

On shmfs I get a time of 2.79 seconds elapsed versus 10.17 seconds
elapsed for ext2 to display the colorized listing. And all of my
directory pages are at most 1/2 full!
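
That is roughly a factor of 3.6. For anyone who wants to try a similar
comparison, something like the following would time a listing that stats
every entry, which is approximately what a colorized ls does. It is a
sketch only: timed_listing() is a hypothetical helper, not the command
used for the numbers above, and the result depends heavily on what is
already cached.

```python
import os
import time


def timed_listing(path):
    """List `path`, lstat every entry, and return (count, elapsed seconds)."""
    start = time.perf_counter()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            entry.stat(follow_symlinks=False)  # one lstat per name
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    n, secs = timed_listing(".")
    print(f"{n} entries in {secs:.2f} seconds")
```

Run it once per filesystem on the same directory contents, ideally right
after mounting so that neither run benefits from a warm cache.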

Eric