Re: UTF-8, OSTA-UDF [why?], Unicode, and miscellaneous gibberish

Matthias Urlichs (smurf@lap.noris.de)
26 Aug 1997 14:04:26 +0200


Hi,

erik@arbat.com (Erik Corry) writes:
>
> What encoding handles a growing, changing character set?
>
None. The Chinese have a problem here. However, they should probably solve
that problem by getting their basic new-character-building mechanisms
incorporated into Unicode somehow, instead of reinventing the square wheel.

Anyway, the kernel will use UTF-8 (or Latin-1) for file names simply
because nothing else works. Personally I'm in favor of the former because
it works for everybody.

Anything else simply won't work. I require of Linux the ability to unpack a
Japanese tar file here and have my Japanese friend be able to unpack the
thing and read the file names directly, not their ASCII manglifications.
Ditto for my Russian friend. Ditto for my Chinese friend, assuming she
first tells me where I can find the appropriate character sets.
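
To make the point concrete, here is a minimal sketch in Python. The sample
file names and the helper names survives() and kernel_safe() are made up for
illustration and are not anything the kernel provides. It checks which
encoding can carry Japanese, Russian and Chinese names unchanged, and that the
UTF-8 bytes never introduce a stray '/' or NUL, so a kernel that treats names
as opaque byte strings should not need to care.

```python
# Hypothetical sample file names; any names outside Latin-1 would do.
NAMES = ["日本語.txt", "привет.txt", "中文.txt", "plain.txt"]


def survives(name, encoding):
    """Return True if `name` can be stored in `encoding` and read back unchanged."""
    try:
        return name.encode(encoding).decode(encoding) == name
    except UnicodeEncodeError:
        # The encoding has no representation for some character in the name.
        return False


def kernel_safe(name):
    """Return True if UTF-8 encoding adds no '/' or NUL bytes to the name.

    Multi-byte UTF-8 sequences use only bytes >= 0x80, so the path
    separator and the string terminator can only come from real
    '/' and NUL characters in the name itself.
    """
    raw = name.encode("utf-8")
    return (raw.count(b"/") == name.count("/")
            and raw.count(b"\0") == name.count("\0"))


if __name__ == "__main__":
    for n in NAMES:
        print(n,
              "utf-8:", survives(n, "utf-8"),
              "latin-1:", survives(n, "latin-1"),
              "kernel-safe:", kernel_safe(n))
```

Only the plain ASCII name survives Latin-1, while all four survive UTF-8.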

We obviously are NOT there yet, but this is the kernel list and anything
beyond UTF-8 file names is beyond the kernel's responsibility, so take it
somewhere else please.

> Linux has already standardised on UTF-8 for the console. The
> suggestion of converting all file systems to a single
> encoding is probably a useful one, and should probably
> be available as a (default?) mount option.
>
Right!

-- 
Matthias Urlichs