Re: dynamic hash table allocation

Peter Steiner (p.steiner@t-online.de)
Sat, 19 Jun 1999 18:59:41 +0200


In article <Pine.BSO.4.10.9906181037590.2734-100000@funky.monkey.org>, Chuck Lever wrote:

>i think that's artificial -- in real-world situations, we don't
>have that kind of control over the hash, and it would "dishonest" to
>suggest that this number meant something specific.

Of course, that's why I like the fact that the #define is hidden
somewhere in the sourcecode.

>do you have both SCSI and IDE disks of varying sizes?

Just SCSI, but 3 ext2 and 2 fat filesystems. So I can be sure the
hasnfn isn't specialized on one fs type.

>any benchmarks besides du, like bonnie, or timing a kernel build?

Bonnie might be good for several purposes but not for testing that
hashfn. Ok, here are the results anyway:

original hashfn, MBPB=2:

-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
100 3324 86.4 7582 17.4 3175 16.7 3293 84.8 7736 15.4 145.8 4.5

Buffer hashed: 47117
Longest list: 10
Bucket size : empty 1 2 3 4 5 6 7 8 >=9
Num : 6629 8310 14741 3067 8 1 5 2 3 2

Golden Ratio, MBPB=2:

-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
100 3236 84.4 7278 16.8 2869 15.1 2873 84.7 7866 15.5 150.3 4.8
100 3209 83.4 8128 18.7 2988 15.6 3064 79.2 8380 17.1 151.9 4.6

Buffer hashed: 51609
Bucket size : empty 1 2 3 4 5 6 7 8 >=9
Num : 4969 12030 9337 4846 1563 23 0 0 0 0

>>6, MBPB=2:

-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
100 3341 87.0 8590 20.6 3302 17.7 3191 82.4 8956 18.5 145.7 3.8
100 3385 88.8 10042 24.0 3303 17.3 3163 82.0 8074 13.9 153.5 4.4

Buffer hashed: 49660
Bucket size : empty 1 2 3 4 5 6 7 8 >=9
Num : 5933 10863 9715 5663 592 2 0 0 0 0

>>bh_hash_bits, MBPB=2:

-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
100 2804 73.0 7591 17.7 2981 15.5 2595 84.5 7951 14.2 148.3 5.0
100 3461 90.1 7418 17.2 3036 15.9 2889 85.4 8563 15.3 139.3 4.5

Buffer hashed: 50247
Bucket size : empty 1 2 3 4 5 6 7 8 >=9
Num : 8870 5640 10875 6713 647 12 8 2 1 0

It's hard to get a good snapshot of the bucketsizes.

And here's the kernel-build bench:

real user sys

orig. hashfn, MBPB=2: 9m31.214s 8m16.080s 0m52.510s
9m08.657s 8m16.770s 0m41.870s

>>bh_hash_bits, MBPB=2: 9m25.676s 8m17.030s 0m42.920s
9m09.503s 8m18.310s 0m41.880s

Golden Ratio, MBPB=2: 9m24.848s 8m17.300s 0m42.160s
9m08.341s 8m16.120s 0m42.620s

>>6, MBPB=2: 9m37.618s 8m14.920s 0m53.370s
9m06.211s 8m14.550s 0m43.300s

orig. hashfn, MBPB=8: 9m22.338s 8m15.340s 0m43.210s
9m07.373s 8m15.970s 0m42.750s

>>bh_hash_bits, MBPB=8: 9m28.107s 8m14.850s 0m52.410s
9m07.503s 8m15.040s 0m42.400s

Golden Ratio, MBPB=8: 9m37.053s 8m17.790s 0m52.340s
9m04.462s 8m16.270s 0m43.250s

>>6, MBPB=8: 9m21.529s 8m14.990s 0m43.150s
9m05.890s 8m16.500s 0m41.930s

Really boring...

>your tests suggest fixing the shift value at 6,

Yes, and I have a theory why this is a good thing:

Look at what a shift value of 6 actually does: it produces a gap of one
block every 64 blocks. So basically 64 blocks occupy a space of 65
buckets.

At this point I have to define 'skew'.

What is 'skew'? When reading/writing blocks linearly the original
hashfn uses each bucket every 'hashsize' blocks. A shifting hashfn
steps a little faster through the hash table, so after 'hashsize'
blocks it is already 'skew' buckets in the next run.

A >>bh_hash_bits hashfn has a constant skew of 1 while the skew of a
fixed-shift hashfn increases with the hash table size.

What's the best skew? There are two contradicting criteria.

1. A small skew increases the total cycle length of the hashfn.

2. A big skew helps putting consecutive blocks of consecutive runs into
their own bucket.

A fixed shift value of 6 results in a small skew relative to the
hashzize (1/64) and in a big skew in absolute numbers. With small hash
tables the total cycle length is the important thing. Small hash tables
have a small skew and thus a relatively long cycle. With bigger hash
tables the cycle length as well as the skew increase, so both cycle
length and skew are strictly improved. If I can show that the >>6
hashfn works for small hash tables it automatically means it works even
better for bigger hash table sizes. I tested >>6 with 4096, 8192,
16384, 32768 and 65536 buckets.

>have you examined the goodness of the page cache hash function?

No, that's a different game.

Peter

-- 
   _   x    ___
  / \_/_\_ /,--'  p.steiner@t-online.de (Peter Steiner)
  \/>'~~~~//
    \_____/   signature V0.2 alpha

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/