Re: A Great Idea (tm) about reimplementing NLS.

From: Pavel Machek
Date: Sun Jun 19 2005 - 13:54:16 EST


Hi!

> > > Ext2/3's encoding has always been utf-8. Period.
> >
> > In what way does Ext2/3 know or care about file name encoding? Doesn't
> > it just store an arbitrary 8-byte string? Couldn't someone claim that
> > from the start it was designed to use iso8859-1 just as easily as you
> > can claim it was designed to use utf-8?
>
> Because we've had this discussion^H^H^H^H^H^H^H^H^H^H^H flame war
> years ago, and despite people from Russia whining that that it took 3
> bytes to encode each Cyrillic character in UTF-8, it's where we came out.
>
> The bottom-line though is that if someone files a bug report with ext3
> because one user on the system was is creating filenames in Japanese,
> and another user on the same time-sharing system is creating filenames
> in Germany, and they fail to interoperate, and they were doing so in
> their local language, we would laugh at them --- just as people
> writing mail programs would laugh at people who complained that they
> were running into problems Just Sending 8-bits instead of using MIME,
> and could you please fix this business-critical bug?
>
> Or as more and more desktop programs start interpreting the filenames
> as UTF-8, and people with local variations get screwed, that is their
> problem, and Not Ours.

Actually the day we have rm utf-8-ed, we have a problem. Someone will
create two files that have same utf name, encoded differently, and
will be in trouble. Remember old > \* "hack"? utf-8 makes variation
possible...

If we are serious about utf-8 support in ext3, we should return
-EINVAL if someone passes non-canonical utf-8 string.

Pavel
--
teflon -- maybe it is a trademark, but it should not be.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/