Re: UTF-8 and case-insensitivity

From: Marc Lehmann
Date: Wed Feb 18 2004 - 02:56:45 EST


On Wed, Feb 18, 2004 at 02:26:54PM +1100, tridge@xxxxxxxxx wrote:
> Even within UCS-2 land the case-mapping table is sparse as only some
> characters have a upper/lower mapping. In fact, there are just 636
> characters out of 64k that have an upper/lower case mapping that isn't
> the identity. That is across *all* languages that windows uses for
> UCS-2.

This is because scripts differentiating between upper and lower case are
rare exceptions in the world.

Unfortunately, commonly used exceptions, and still locale dependent.

Having a samba-helper kernel module that would contain this table (I am
confident that it's only a single table in existing versions of windows,
but maybe they improve that in future versions) could solve this problem.

I still wonder wether it ever can be made efficient, though.

--
-----==- |
----==-- _ |
---==---(_)__ __ ____ __ Marc Lehmann +--
--==---/ / _ \/ // /\ \/ / pcg@xxxxxxxx |e|
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE --+
The choice of a GNU generation |
|
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/