Re: UTF-8 and case-insensitivity

From: Linus Torvalds
Date: Tue Feb 17 2004 - 10:14:38 EST




On Tue, 17 Feb 2004 tridge@xxxxxxxxx wrote:
>
> From memory, the patch added new classes of dentries to the current
> "+ve" and "-ve" dentries. It added concepts like a "-ve
> case-insensitive" dentry and a "-ve case-sensitive" dentry. It
> certainly adds more code in trying to deal with these variants, but I
> see no reason why it should be significantly computationally less
> efficient.

Yes, we could add context sensitivity to the dcache with a context
bitmask.

However, it's _not_ correct.

It assumes that there is only one way to do lower/upper case, which just
isn't true. What about different locales that have different case rules?
Your "one bit per dentry" becomes "one bit per locale per dentry". That's
just horribly hard to do.

I don't know how Windows does it, so maybe this thing is hardcoded, and
you don't even want "true" case insensitivity. How "correct" is Windows?

(And don't even bother telling me about the translation table in NTFS
volumes - I'm not interested. This would have to work on a sane filesystem
to be useful, even for samba.)

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/