Re: Hashing and directories

From: H. Peter Anvin (
Date: Thu Feb 22 2001 - 18:22:29 EST

Bill Crawford wrote:
> A particular reason for this, apart from filesystem efficiency,
> is to make it easier for people to find things, as it is usually
> easier to spot what you want amongst a hundred things than among
> a thousand or ten thousand.
> A couple of practical examples from work here at Netcom UK (now
> Ebone :), would be say DNS zone files or user authentication data.
> We use Solaris and NFS a lot, too, so large directories are a bad
> thing in general for us, so we tend to subdivide things using a
> very simple scheme: taking the first letter and then sometimes
> the second letter or a pair of letters from the filename. This
> actually works extremely well in practice, and as mentioned above
> provides some positive side-effects.

This is sometimes feasible, but sometimes it is a hack with painful
consequences in the form of software incompatibilites.

> I guess what I really mean is that I think Linus' strategy of
> generally optimizing for the "usual case" is a good thing. It
> is actually quite annoying in general to have that many files in
> a single directory (think \winnt\... here). So maybe it would
> be better to focus on the normal situation of, say, a few hundred
> files in a directory rather than thousands ...

Linus' strategy is to not let optimizations for uncommon cases inflict
the common case. However, I think we can make an improvement here that
will work well even on moderate-sized directories.

My main problem with the fixed-depth tree proposal is that is seems to
work well for a certain range of directory sizes, but the range seems a
bit arbitrary. The case of very small directories is also quite
important, too.


<> at work, <> in private!
"Unix gives you enough rope to shoot yourself in the foot."
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

This archive was generated by hypermail 2b29 : Fri Feb 23 2001 - 21:00:29 EST