Re: unicode (char as abstract data type)

NIIBE Yutaka (gniibe@mri.co.jp)
Mon, 20 Apr 1998 11:50:04 +0900


Hello Albert,

Albert D. Cahalan writes:
> 1. libc gets UCS2 directory listing from the kernel
> 2. ls (both of them) get KOI-8 from libc
> 3. head and the shell get KOI-8 -- it's in userspace
> 4. libc gets KOI-8 from the shell
> 5. the kernel gets UCS2 from libc

Yes, this is one solution. The point is that the kernel doesn't
matter much.
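
To make the five steps above concrete, here is a rough sketch of the
round trip that libc would perform. It is only an illustration under
assumptions: KOI8-R stands in for "KOI-8", UTF-16-LE restricted to the
BMP stands in for UCS2, and the function names kernel_to_user and
user_to_kernel are hypothetical, not real libc interfaces.

```python
# Sketch only: KOI8-R is assumed for "KOI-8", and UTF-16-LE (BMP only)
# is assumed as the byte layout of UCS2. Function names are hypothetical.

def kernel_to_user(ucs2_name: bytes, charset: str = "koi8_r") -> bytes:
    """Steps 1-2: convert a UCS2 name from the kernel to the user's charset."""
    return ucs2_name.decode("utf-16-le").encode(charset)


def user_to_kernel(name: bytes, charset: str = "koi8_r") -> bytes:
    """Steps 4-5: convert a name from userspace back to UCS2."""
    return name.decode(charset).encode("utf-16-le")


# A Russian file name (Cyrillic letters written as escapes) plus ".txt".
text = "\u0444\u0430\u0439\u043b.txt"
kernel_name = text.encode("utf-16-le")    # what the kernel would hand out
user_name = kernel_to_user(kernel_name)   # what ls and the shell would see

# Step 3 happens entirely in userspace on user_name; then the name goes back.
assert user_to_kernel(user_name) == kernel_name
```

All of the conversion lives in the two libc-side functions; nothing in
the sketch depends on what the kernel does with the bytes.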

However, please don't define the kernel string as "UCS2 encoded in
UTF-8". I think it is better for the kernel not to define any
encoding (apart from reserving '\0' and '/'), and to leave the
interpretation to higher layers. (The console driver is an
exception.) Please don't do that, at least in the near future.

Actually, in the current implementation, the kernel string can be ISO
8859-1, KOI-8, EUC-JP (Japanese), or any other encoding that doesn't
use '\0' and '/'. That is not good enough in terms of
standardization, but it works well for many people.
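
As a small illustration of that rule, the check below treats a name as
an opaque byte string and rejects only '\0' and '/'. It is a sketch
under assumptions, not the kernel's actual code: is_valid_component
is a hypothetical helper, and KOI8-R is assumed for "KOI-8".

```python
def is_valid_component(name: bytes) -> bool:
    """Accept any non-empty byte string that contains neither NUL nor '/'."""
    return len(name) > 0 and b"\0" not in name and b"/" not in name


# One sample name per encoding mentioned above (non-ASCII text as escapes).
samples = {
    "latin-1": "caf\u00e9",                 # ISO 8859-1
    "koi8_r": "\u0444\u0430\u0439\u043b",   # KOI8-R, assumed for "KOI-8"
    "euc_jp": "\u65e5\u672c\u8a9e",         # EUC-JP
}
for charset, text in samples.items():
    assert is_valid_component(text.encode(charset))

# Raw 16-bit UCS2 does not pass: ASCII characters carry a zero byte.
assert not is_valid_component("a".encode("utf-16-le"))
```

The last line shows why a raw 16-bit encoding cannot be stored this
way, while the byte-oriented encodings pass without the kernel knowing
which one is in use.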

When the technology for handling multilingual text matures, it will
become a kernel issue. But not for now, I think.

Thanks,

-- 
NIIBE Yutaka
