Re: A Great Idea (tm) about reimplementing NLS.

From: Bernd Eckenfels
Date: Sat Jun 18 2005 - 14:12:50 EST

In article <200506181804.21366.robin.rosenberg.lists@xxxxxxxxxx> you wrote:
> Every unicode character has exactly one UTF-8 representation.

Every Unicode code point has exactly one UTF-8 representation; however, a few
glyphs can be produced by more than one code point sequence. This is not only a
problem because of homoglyphs, which look alike or similar, but also because of
combining characters vs. (precomposed) legacy characters. However, that's more
an issue for the user interface (think IDN exploits).
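
To make the combining-character point concrete, here is a minimal Python
sketch. It only assumes the standard unicodedata module; the helper names
utf8() and same_after_nfc() are made up for illustration. It shows one glyph
("e" with an acute accent) stored as two different byte strings, which
normalization can unify, and a Latin/Cyrillic homoglyph pair, which it cannot.

```python
import unicodedata

# Same glyph, two different code point sequences.
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE (legacy/precomposed)
combining = "e\u0301"    # 'e' followed by COMBINING ACUTE ACCENT

def utf8(s):
    """Illustrative helper: return the UTF-8 bytes of a string."""
    return s.encode("utf-8")

def same_after_nfc(a, b):
    """Illustrative helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(utf8(precomposed))   # b'\xc3\xa9'
print(utf8(combining))     # b'e\xcc\x81'

# Byte-wise these differ, so as file names they would be two distinct entries.
print(utf8(precomposed) == utf8(combining))     # False

# Normalization unifies the combining vs. precomposed forms ...
print(same_after_nfc(precomposed, combining))   # True

# ... but not homoglyphs: Latin 'a' vs. CYRILLIC SMALL LETTER A (U+0430).
print(same_after_nfc("a", "\u0430"))            # False
```

So each code point still has exactly one UTF-8 encoding; the ambiguity sits
one level up, in which code points get used to write the glyph.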

Personally I think the on-disk filesystem format should be required to be
UTF-8, and it's an open discussion whether the syscalls accept UTF-8 or locale
byte encodings. Currently it's a mess. We can learn from Windows here :)

