Re: [RFC] [PATCH] Improve hash function used for full_name_hash()

From: David Miller
Date: Mon Jan 04 2010 - 15:46:17 EST


From: Guenter Roeck <guenter.roeck@xxxxxxxxxxxx>
Date: Mon, 04 Jan 2010 12:09:44 -0800

> Please comment on this proposed patch. It is similar but more generic than
> a previously proposed change to dev_name_hash() which tried to address
> the same problem.

Since this changes a filesystem subsystem header file and effects
parts of the kernel outside of networking, you really do need to
CC: linux-kernel at a minimum. Added...

> The hash function currently used for full_name_hash() produces a large number
> of collisions if hashed names are similar. This can cause performance problems
> if a large number of similar names exist in the kernel (e.g., if there is
> a large number of virtual interfaces).
>
> For example, when hashing "eth0" .. "eth9999" with a hash table size of 256,
> the resulting minimum hash bucket depth is 0, the maximum depth is 563,
> and the standard deviation is ~136.
>
> With this patch applied, the same test results in a minimum bucket depth
> of 37, a maximum bucket depth of 42, and a standard deviation of ~1.02.
>
> The hash factor of 41 was chosen for the following reasons:
> - The resulting standard deviation is significantly better than the standard
> deviation of the original hash function for all tested hash table sizes
> (2^x, x=4..16).
> - The hash function is simple.
> - The resulting code does not require a multiply instruction
> (tested: x86, mips, powerpc).
> - The resulting code is more efficient than the code generated for the
> original hash (x86, gcc -O2: 3 instead of 7 instructions).
> - The resulting code also works well with more random strings
> (tested with all file names in a given Linux system).
>
> Signed-off-by: Guenter Roeck <guenter.roeck@xxxxxxxxxxxx>
> ---
> include/linux/dcache.h | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/dcache.h b/include/linux/dcache.h
> index 30b93b2..772755d 100644
> --- a/include/linux/dcache.h
> +++ b/include/linux/dcache.h
> @@ -53,7 +53,7 @@ extern struct dentry_stat_t dentry_stat;
> static inline unsigned long
> partial_name_hash(unsigned long c, unsigned long prevhash)
> {
> - return (prevhash + (c << 4) + (c >> 4)) * 11;
> + return (prevhash + c) * 41;
> }
>
> /*
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/