Re: JFS default behavior / UTF-8 filenames

From: Dave Kleikamp
Date: Fri Feb 20 2004 - 10:04:37 EST


On Thu, 2004-02-19 at 17:47, kernel@xxxxxxxxxxxx wrote:
> While I don't really care one way or the other about the whole
> "rejecting non-UTF8 filenames" thing, trying to store 8bit strings in
> UTF2 (no such thing, is there? Is JFS UCS-2 or UTF-16?)

UCS-2 - I can't keep this stuff straight.

> seems really
> ugly. In general at least, maybe it's not so bad in JFS's case
> specifically because of there not being much sharing of JFS filesystems
> between linux and non-linux systems.
>
> But if JFS uses that "make the high byte zero and return the low byte
> only" scheme, what does it do when it encounters a UCS-2 filename that
> has a non-NUL high byte on an existing filesystem? I can't see any ways
> of dealing with this that aren't much more horribly broken than merely
> refusing to create filenames that aren't valid in the current encoding.
> If it throws the high byte away then you've made it impossible to open
> said files, and up to 256 files per character of the filename can now
> appear to have the same filename.
>
> So what does JFS do in its "throw away the high byte and store binary
> character strings in the low byte" mode? How does it deal with an
> existing filesystem that has filenames that don't conform to said rule?

With no iocharset specified, a filename with such a character will be
inaccessible. Probably the best thing for readdir to do is to
substitute a '?' and print a message to the syslog to mount the volume
with iocharset=utf8 to be able to access the file. Of course I would
limit the number of printk's to something small. I'll submit a patch to
do this.
--
David Kleikamp
IBM Linux Technology Center

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/