stty utf8

From: Jamie Lokier
Date: Mon Feb 16 2004 - 10:08:38 EST


Linus Torvalds wrote:
> People understand the problem. And UTF-8 is the solution.

Linus, I agree 100%.
My own filesystems have UTF-8 file names, of course.

There are still practical problems, two of which stand out:

1. Just because you hope a filesystem is UTF-8, does not preclude
readdir() from returning non-UTF-8 names. (These are far too
easy to create by accident).

Because of that, programs which interpret the result of
readdir() as text, yet are expected to handle any name without
silently rejecting them or aborting, are forced into strange
compromises which break basic expectations.

Spot the bug in this perl script:

perl -e 'for (glob "*") { rename $_, "Åi-".$_ or die "rename: $!\n"; }'

(NB: The prefix string is N WITH CEDILLA followed by "i-").

(Hint: it mangles perfectly fine non-ASCII file names).

Perl has no perfect behaviour to offer, because what should that
behaviour be if readdir() might return a non-UTF-8 byte sequence
as a name?


2. Terminals are not all UTF-8, and some never will be.

So when someone types something like this on a non-UTF-8
terminal, they get non-UTF-8 filename:

vi el-niÃo.txt

It isn't just a problem of display. Now you have created a
filename which isn't valid UTF-8, and GUI programs may complain,
perhaps refusing to let you select the file.

Furthermore, how exactly do you expect a user to use UTF-8 on
the filesystem when their terminal is not (or sometimes is not)
using UTF-8?


==> This problem would be very nicely solved with an additional
terminal flag. We have "stty ocrnl", "onlcr", "igncr" etc. to
translate between terminal line endings and the unix convention of
LF at the end of each line. Why not create "stty utf8" so that
non-UTF-8 terminals and UTF-8 terminals alike can work with a
Linux convention that all programs enter and display UTF-8? It
would simplify a lot of things.


-- Jamie

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/