Re: UTF-8 and case-insensitivity

From: tridge
Date: Wed Feb 18 2004 - 17:53:32 EST

Next message: Stephen Hemminger: "kernel/microcode.c error from new 64bit code"
Previous message: Andrew Morton: "Re: Non-GPL export of invalidate_mmap_range"
In reply to: Linus Torvalds: "Re: UTF-8 and case-insensitivity"
Next in thread: Linus Torvalds: "Re: UTF-8 and case-insensitivity"

Linus,

> And I bet the performance advantages of _not_ doing native case
> insensitivity are likely to dominate hugely.

This part I just don't understand at all. The proposed changes would
be extremely cheap performance-wise, as you are just replacing one hash
with another and dealing with one extra context bit in the
dcache. There is no way that this could come anywhere near the cost of
doing linear directory scans.

The hash function would be slightly more expensive (when enabled), but
not by much, especially once you put in the obvious optimisation for
7-bit characters. The string comparison function would also become
more expensive in a couple of places, but once again only for
case-insensitive processes, and it benefits from the same 7-bit
optimisation, so the average case would be only very slightly more
expensive than the current function.
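
To make the 7-bit optimisation concrete, here is a rough user-space sketch
of a case-folding hash and comparison with an ASCII fast path. It is not
the dcache code: the names fold_byte, ci_name_hash and ci_name_equal are
hypothetical, FNV-1a merely stands in for whatever hash the dcache uses,
and the slow path is a toy that folds only the Latin-1 capitals
U+00C0..U+00DE where a real one would consult full Unicode case tables.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fold the byte at s[i] for case-insensitive use (hypothetical helper).
 * Fast path: 7-bit bytes need one range check and no table lookup.
 * Slow path: a toy stand-in that folds only U+00C0..U+00DE, which UTF-8
 * encodes as 0xC3 followed by 0x80..0x9E (0x97 is the multiplication
 * sign and has no case).
 */
static unsigned char fold_byte(const unsigned char *s, size_t i)
{
    unsigned char c = s[i];

    if (c < 0x80)
        return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + 0x20) : c;

    if (i > 0 && s[i - 1] == 0xC3 && c >= 0x80 && c <= 0x9E && c != 0x97)
        return (unsigned char)(c + 0x20);

    return c;
}

/* FNV-1a over the folded bytes; the hash choice is an assumption. */
uint32_t ci_name_hash(const char *name, size_t len)
{
    const unsigned char *s = (const unsigned char *)name;
    uint32_t hash = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        hash ^= fold_byte(s, i);
        hash *= 16777619u;
    }
    return hash;
}

/*
 * Case-insensitive equality using the same folding as the hash.
 * Assumes folding never changes the byte length of a name, which holds
 * for this toy fold but not for Unicode case folding in general.
 */
bool ci_name_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;

    if (alen != blen)
        return false;
    for (size_t i = 0; i < alen; i++)
        if (fold_byte(ua, i) != fold_byte(ub, i))
            return false;
    return true;
}
```

For a pure-ASCII name, the only extra work per byte over a case-sensitive
version is the range check in the fast path.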

Fair enough that you don't want to do this for code complexity
reasons, but please don't tell me it would be slower than what we have
to do now.

Try an strace of Samba trying to unlink() a non-existent file in a
large directory. It's enough to make you want to curl up and die :)
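
For illustration, this is roughly the kind of user-space scan a
case-insensitive server is forced into when the kernel only offers
case-sensitive lookups. It is a sketch, not Samba's actual code: the
function name find_name_ci is hypothetical, and strcasecmp() only folds
according to the current locale, so it is not a real UTF-8 comparison.

```c
#include <dirent.h>
#include <stddef.h>
#include <stdio.h>
#include <strings.h>

/*
 * Look for an entry in dirpath whose name matches wanted, ignoring case.
 * Returns 1 and copies the on-disk name into out if found, 0 if no entry
 * matches, and -1 if the directory cannot be opened.
 *
 * The miss case is the expensive one: every entry in the directory has
 * to be read and compared before we can say the name is absent.
 */
int find_name_ci(const char *dirpath, const char *wanted,
                 char *out, size_t outlen)
{
    DIR *dir = opendir(dirpath);
    struct dirent *ent;
    int found = 0;

    if (dir == NULL)
        return -1;

    /* readdir() errors are treated like end-of-directory in this sketch. */
    while ((ent = readdir(dir)) != NULL) {
        if (strcasecmp(ent->d_name, wanted) == 0) {
            if (out != NULL && outlen > 0)
                snprintf(out, outlen, "%s", ent->d_name);
            found = 1;
            break;
        }
    }

    closedir(dir);
    return found;
}
```

Only when this returns 0 can the caller report that the file does not
exist, so an unlink() of a missing name costs a full pass over the
directory.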

Cheers, Tridge
