Re: [rfc] Near-constant time directory index for Ext2

From: H. Peter Anvin (hpa@transmeta.com)
Date: Wed Feb 21 2001 - 17:38:21 EST


Martin Mares wrote:
>
> Hello!
>
> > Not true. The rehashing is O(n) and it has to be performed O(log n)
> > times during insertion. Therefore, insertion is O(log n).
>
> Rehashing is O(n), but the "n" is the _current_ number of items, not the
> maximum one after all the insertions.
>
> Let's assume you start with a single-entry hash table. You rehash for the
> first time after inserting the first item (giving hash table of size 2),
> then after the second item (=> size 4), then after the fourth item (=> size 8)
> and so on. I.e., when you insert n items, the total cost of rehashing summed
> over all the insertions is at most 1 + 2 + 4 + 8 + 16 + ... + 2^k (where
> k=floor(log2(n))) <= 2^k+1 = O(n). That is O(1) operations per item inserted.
>

You're right. However, for each hash table operation to be O(1) the size
of the hash table must be >> n.

I suggested at one point to use B-trees with a hash value as the key.
B-trees are extremely efficient when used on a small constant-size key.

        -hpa

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Feb 23 2001 - 21:00:25 EST