Re: [rfc] Near-constant time directory index for Ext2

From: H. Peter Anvin
Date: Wed Feb 21 2001 - 17:38:21 EST

Martin Mares wrote:
> Hello!
> > Not true. The rehashing is O(n) and it has to be performed O(log n)
> > times during insertion. Therefore, insertion is O(log n).
> Rehashing is O(n), but the "n" is the _current_ number of items, not the
> maximum one after all the insertions.
> Let's assume you start with a single-entry hash table. You rehash for the
> first time after inserting the first item (giving hash table of size 2),
> then after the second item (=> size 4), then after the fourth item (=> size 8)
> and so on. I.e., when you insert n items, the total cost of rehashing summed
> over all the insertions is at most 1 + 2 + 4 + 8 + 16 + ... + 2^k (where
> k=floor(log2(n))) < 2^(k+1) = O(n). That is O(1) operations per item inserted.
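
The doubling argument quoted above can be checked with a small sketch. It assumes the schedule described in the quote (start with one slot, rehash every current item each time the table fills, then double). The function name total_rehash_cost is made up for illustration and does not come from any patch in this thread.

```python
def total_rehash_cost(n):
    """Count items moved by rehashing while inserting n items.

    Assumed schedule (from the quoted text): the table starts with one
    slot and is rehashed into a table twice the size as soon as the
    number of stored items reaches the current capacity.
    """
    capacity = 1
    count = 0
    moved = 0
    for _ in range(n):
        count += 1
        if count == capacity:
            moved += count      # every current item is rehashed once
            capacity *= 2
    return moved


if __name__ == "__main__":
    for n in (1, 4, 8, 1000):
        print(n, total_rehash_cost(n))
```

Under these assumptions the total stays below 2n for every n, which is the O(n) bound in the quote, i.e. O(1) amortized per insertion.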

You're right. However, for each hash table operation to be O(1) the size
of the hash table must be >> n.
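
A rough way to see the effect of table size on collisions is to count how many keys share a bucket for different bucket counts. This is only a sketch: the key names, the use of MD5 as a stand-in for a well-mixed hash, and the function colliding_keys are all assumptions for illustration, not anything proposed for Ext2.

```python
import hashlib
from collections import Counter


def colliding_keys(n, buckets):
    """Return how many of n keys land in a bucket shared with another key.

    MD5 is used purely as a deterministic, well-mixed placeholder hash.
    """
    def bucket_of(key):
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % buckets

    load = Counter(bucket_of("file%d" % i) for i in range(n))
    return sum(c for c in load.values() if c > 1)


if __name__ == "__main__":
    n = 1000
    for buckets in (n, 4 * n, 16 * n):
        print(buckets, colliding_keys(n, buckets))
```

With as many buckets as keys, a large fraction of keys still share a bucket; the fraction drops as the table grows well past n.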

I suggested at one point to use B-trees with a hash value as the key.
B-trees are extremely efficient when used on a small constant-size key.
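
The idea of keying an ordered index by a fixed-size hash can be sketched as follows. Note the assumptions: a sorted Python list searched with bisect stands in for the B-tree (it gives the same ordered-by-hash lookup but none of the on-disk block structure), CRC32 is a placeholder hash, and the class HashKeyedIndex is hypothetical.

```python
import bisect
import zlib


class HashKeyedIndex:
    """Entries kept sorted by (hash, name); a stand-in for B-tree leaves."""

    def __init__(self):
        self._entries = []  # sorted list of (hash32, name, inode)

    @staticmethod
    def _hash(name):
        # Placeholder: any fixed-size hash of the file name would do.
        return zlib.crc32(name.encode("utf-8"))

    def insert(self, name, inode):
        bisect.insort(self._entries, (self._hash(name), name, inode))

    def lookup(self, name):
        h = self._hash(name)
        # (h,) sorts before every (h, name, inode), so this finds the
        # first entry whose hash is >= h.
        i = bisect.bisect_left(self._entries, (h,))
        # Walk entries with the same hash and compare names to resolve
        # hash collisions.
        while i < len(self._entries) and self._entries[i][0] == h:
            if self._entries[i][1] == name:
                return self._entries[i][2]
            i += 1
        return None


if __name__ == "__main__":
    index = HashKeyedIndex()
    index.insert("Makefile", 12)
    index.insert("README", 13)
    print(index.lookup("README"), index.lookup("missing"))
```

Because the search key is a small fixed-size integer rather than a variable-length name, comparisons during the search are cheap; the full name is compared only among entries with an equal hash.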


"Unix gives you enough rope to shoot yourself in the foot."
