Re: unicode (char as abstract data type)

Alex Belits (abelits@phobos.illtel.denver.co.us)
Sat, 18 Apr 1998 16:26:29 -0700 (PDT)


On 18 Apr 1998, Kai Henningsen wrote:

> > > Yes, it's a compromise, but it had to be.
> >
> > Charset labeling is a compromise. Unicode is a decision of
>
> Charset labelling is a non-solution.

How so? With anything else, everyone but you (an iso8859-1 user, I assume
from the domain name) gets a load of incompatibility and the dubious
advantage of having more than two languages in one filename in a
directory entry.

>
> > non-representative committee, imposed on everyone else by lazy software
> > vendors who don't want to do language specific processing, but want to
> > label their products as "internationalized".
>
> Complete and utter bullshit.
>
> First, Unicode doesn't claim to solve the language-dependent processing
> problem. Read the fucking standard. It explicitly says you still have to
> do it.

...and removes all means to do it by also claiming that the labeling
problem is solved, and that labeling therefore shouldn't be used.
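
To make concrete what a charset label carries, here is a minimal sketch in
Python. It assumes Python's built-in koi8_r, iso8859_5 and latin_1 codecs;
the byte value and the variable names are arbitrary choices for
illustration, not anything taken from this thread.

```python
# Sketch: one byte, three charset labels, three different characters.
# The labels are Python codec names; the byte value is an arbitrary example.

raw = b"\xc1"  # a single byte as it might sit in a filename

labels = ["koi8_r", "iso8859_5", "latin_1"]

# Decode the same byte under each label.
decoded = {label: raw.decode(label) for label in labels}

for label, ch in decoded.items():
    print(f"{label:10s} U+{ord(ch):04X} {ch}")
```

Without the label, nothing in the byte itself says which of the three
characters was meant.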

> What it does is enable you to do all the language-independent
> processing to text in any language. And it does that very well, far better
> than any other solution.

This is an iso8859-1 user's opinion.

> Second, Unicode is the exact same character set as ISO 10646, which has
> more than enough representation from everywhere on the world.

What kind of "representation"? And why is it that not a single Unix user
in Russia uses it or can use it?

> Don't lie.

I don't. Check actual facts before accusing people.

> > > Now, with 8-bit charsets being common, people living in
> > > countries where 8 bits are enough (especially ISO 8859-1 countries)
> > > are whining about the complexity of supporting more than 8 bits.
> >
> > AFAIK, people who always used more than 8 bits are not the biggest
> > proponents of Unicode either -- europeans (iso8859-1) and americans
> > (us-ascii) are.
>
> Not really. And it's only partly the "8 was always enough"; what it really
> is is "I have something that works for my language, what the fuck do I
> care that it doesn't work for most other languages". Problem is, that
> attitude is not working very well any more.

This is _exactly_ what Unicode proponents are doing -- having _their_
encoding extended to multibyte, stepping on other people's encodings and
wondering why others don't like it.

> > > I really would hate to see Linux falling behind in this area.
> >
> > I would rather have the handling of national alphabets done by people
> > who use them in their everyday life. Otherwise there will be a lot of
> > pissed off people and unusable software.
>
> Guess what? Unicode/ISO 10646 was designed by people using national
> alphabets in their everyday life.

Then why does the Russian part of it match an encoding that was never
actually used in Russia at the time of its development? It doesn't look
to me like anyone was asked.
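
The layout in question can be checked directly. The sketch below assumes
Python's built-in iso8859_5 and koi8_r codecs, and it assumes KOI8-R as the
charset meant for Russian Unix systems, which this message does not name.
offsets_for is a hypothetical helper written for this example.

```python
# Sketch: compare how the Unicode Cyrillic block lines up with two 8-bit
# charsets. Assumes the built-in "iso8859_5" and "koi8_r" codecs;
# offsets_for is a hypothetical helper name.

def offsets_for(codec, text):
    """Return the set of (code point - byte value) differences per letter."""
    return {ord(ch) - ch.encode(codec)[0] for ch in text}

# The basic uppercase and lowercase Russian letters, U+0410..U+044F.
letters = "".join(chr(cp) for cp in range(0x0410, 0x0450))

iso_offsets = offsets_for("iso8859_5", letters)
koi_offsets = offsets_for("koi8_r", letters)

# A single offset means the block is a straight shifted copy of the charset.
print("iso8859_5 offsets:", sorted(hex(o) for o in iso_offsets))
print("koi8_r distinct offsets:", len(koi_offsets))
```

A single offset for iso8859_5 means the Unicode block keeps that charset's
letter order, while KOI8-R needs a per-letter table to convert.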

> And pre-Unicode software generally is pretty much unusable wrt. lots of
> national alphabets.

Most software works with 8-bit charsets perfectly, and multibyte
charsets are supported by a lot of software separately. Unicode is not
supported by anything to a usable extent, and makes it a pain to support
anything right, so I really can't see any advantage.

--
Alex
