On Sat, Feb 24, 2001 at 10:43:16AM +1300, Ralph Loader wrote:
> A while ago I did some experimentation with simple bitop based string
> hash functions. I.e., no multiplications / divides in the hash loop.
>
> The best I found was:
>
> int hash_fn (char * p)
> {
> int hash = 0;
> while (*p) {
> hash = hash + *p;
> // Rotate a 31 bit field 7 bits:
> hash = ((hash << 7)  (hash >> 24)) & 0x7fffffff;
> }
> return hash;
> }
Hmm. This one goes in the "catastrophic" category.
For actual names:
N=557398, m=51 sz=2048, min 82, max 4002, av 272.17, stddev 45122.99
For generated names:
N=557398, m=51 sz=2048, min 0, max 44800, av 272.17, stddev 10208445.83
A very nonuniform distribution.
> The rotate is equivalent to a multiplication by x**7 in Z_2[P=0],
> where P is the polynomial x**31  1 (over Z_2).
> Presumably the "best" P would be irreducible  but that would have more
> bits set in the polynomial, making reduction harder. A compromise is to
> choose P in the form x**N  1 but with relatively few factors.
> X**31  1 is such a P.
It has seven irreducible factors. Hardly "almost irreducible".
Shifting the 7bit ASCII characters over 7 bits makes sure that there
is very little interaction to start with. And the final AND that truncates
to the final size of the hash chain kills the high order bits.
No good.
Andries

To unsubscribe from this list: send the line "unsubscribe linuxkernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomoinfo.html
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Fri Feb 23 2001  21:00:31 EST