Re: UTF-8 in file systems? xfs/extfs/etc.

From: Måns Rullgård
Date: Tue Feb 10 2004 - 18:24:39 EST


jw schultz <jw@xxxxxxxxxx> writes:

> On Mon, Feb 09, 2004 at 08:32:12PM -0800, Mike Fedyk wrote:
>> On Mon, Feb 09, 2004 at 02:36:24PM +0100, Matthias Urlichs wrote:
>> > Hi, Nico Schottelius wrote:
>> >
>> > > What Linux supported filesystems support UTF-8 filenames?
>> >
>> > Filenames, to the kernel, are a sequence of 8-bit things commonly
>> > called "bytes" or "octets", excluding '/' and '\0'.
>> >
>>
>> You can have "/" in the filename also, though that could be encoded
>> somehow...
>
> You might be able to have a non-ASCII character that looks
> like / but not 0x2f.
>
> I for one do not want open("/var/tpm/diddle", O_WRONLY | O_CREAT)
> to create a file "tpm/diddle" in /var just because /var/tpm
> doesn't exist.

Just imagine all the possibilities for ambiguous file names it would
open up.

> I expect UTF-8 to have no multi-byte sequences containing NUL
> but it might be awkward if a multi-byte sequence contained
> 0x2F (/). I would hope that the committees chose to avoid
> using symbol and punctuation byte-codes for alphanumeric
> sequences.

IIRC, UTF-8 doesn't use bytes <128 in any multi-byte sequences.

--
Måns Rullgård
mru@xxxxxx

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/