Re: [rfc] Near-constant time directory index for Ext2

From: Davide Libenzi (
Date: Wed Feb 21 2001 - 18:52:13 EST

On 21-Feb-2001 Daniel Phillips wrote:
> "H. Peter Anvin" wrote:
>> Martin Mares wrote:
>> >
>> > > True. Note too, though, that on a filesystem (which we are, after all,
>> > > talking about), if you assume a large linear space you have to create a
>> > > file, which means you need to multiply the cost of all random-access
>> > > operations with O(log n).
>> >
>> > One could avoid this, but it would mean designing the whole filesystem in
>> > a
>> > completely different way -- merge all directories to a single gigantic
>> > hash table and use (directory ID,file name) as a key, but we were
>> > originally
>> > talking about extending ext2, so such massive changes are out of question
>> > and your log n access argument is right.
>> It would still be tricky since you have to have actual files in the
>> filesystem as well.
> Have you looked at the structure and algorithms I'm using? I would not
> call this a hash table, nor is it a btree. It's a 'hash-keyed
> uniform-depth tree'. It never needs to be rehashed (though it might be
> worthwhile compacting it at some point). It also never needs to be
> rebalanced - it's only two levels deep for up to 50 million files.
> This thing deserves a name of its own. I call it an 'htree'. The
> performance should speak for itself - 150 usec/create across 90,000
> files and still a few optmizations to go.
> Random access runs at similar speeds too, it's not just taking advantage
> of a long sequence of insertions into the same directory.
> BTW, the discussion in this thread has been very interesting, it just
> isn't entirely relevant to my patch :-)


I'm all but saying that Your algo is not good.
I use something very like to it in my mail server ( XMail ) to index mail queue
files that has a two level depth fs splitting.
The mine was only an hint to try different types of directory indexing.

- Davide

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

This archive was generated by hypermail 2b29 : Fri Feb 23 2001 - 21:00:26 EST