Re: unicode (char as abstract data type)

Albert D. Cahalan (acahalan@cs.uml.edu)
Sat, 18 Apr 1998 21:50:40 -0400 (EDT)


Raul Miller writes:
> Albert D. Cahalan <acahalan@cs.uml.edu> wrote:

>> We are stuck in a world with multiple character encodings.
>> To convert, you generally need to go through UCS2.
>> The kernel must convert for foreign filesystem support.
>
> Why? This seems like a perfect job for userspace.
> (ld preload, if nothing else).

No, we already have this in the kernel _and_ a conversion back
into some other encoding. We can't get rid of it because userspace
is not generally aware of mount point crossings and mount options.

It would be really bad if libc needed to know details of the
filesystem just to read the directory.

For directory reads (writes are the reverse), the minimum needed in
the kernel is a conversion _to_ UCS-2 Unicode. Currently, the kernel
also converts back to an 8-bit encoding or to UTF-8. That second step
could move into libc, but the first half must remain in the kernel.
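
To make the userspace half concrete, here is a rough sketch of what a
libc-side conversion from UCS-2 to UTF-8 might look like. The function
name ucs2_to_utf8 and its signature are hypothetical, not an existing
libc or kernel interface, and it assumes the kernel hands back names as
an array of 16-bit code units in host byte order. Being plain UCS-2, it
treats every unit as a standalone character and does not pair surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode n UCS-2 code units as UTF-8.
 * Writes a NUL-terminated string into out (capacity outsz bytes).
 * Returns the number of bytes written, not counting the NUL,
 * or (size_t)-1 if the output buffer is too small.
 */
size_t ucs2_to_utf8(const uint16_t *in, size_t n,
                    unsigned char *out, size_t outsz)
{
    size_t o = 0;

    if (outsz == 0)
        return (size_t)-1;

    for (size_t i = 0; i < n; i++) {
        uint16_t c = in[i];
        /* UCS-2 values need 1, 2, or 3 bytes in UTF-8. */
        size_t need = (c < 0x80) ? 1 : (c < 0x800) ? 2 : 3;

        /* Always leave room for the trailing NUL. */
        if (o + need >= outsz)
            return (size_t)-1;

        if (need == 1) {
            out[o++] = (unsigned char)c;
        } else if (need == 2) {
            out[o++] = (unsigned char)(0xC0 | (c >> 6));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xE0 | (c >> 12));
            out[o++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

Nothing in this step depends on which filesystem the name came from,
which is why it could live in libc. The conversion _to_ UCS-2 is the
part that needs the mount options.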

