Re: [PATCH] Sanitize filesystem NLS handling

From: Alexander E. Patrakov
Date: Mon Mar 19 2007 - 00:38:23 EST


OGAWA Hirofumi wrote:
"Alexander E. Patrakov" <patrakov@xxxxxxxxxx> writes:

You allow to set any nls to codepage? If so, it is not good.
I did this because it involved less changes. Only FAT treats codepage as a number. All other filesystems already allow arbitrary NLS as a codepage mount parameter.

I'm saying here, it is not good for vfat.

All valid options (i.e. numbers) are still available. Prohibiting non-numbers is a lot of work (changing module_param_string to module_param_call and validating the passed string, and also changing the code for all filesystems so that they also validate manually-passed codepage string).

No, utf-8 makes completely wrong entry. It's more wrong than other nls.
For any non-UTF-8 based locales, the other NLS is correct and utf8 indeed would produce completely wrong characters. But for UTF-8 based locales, utf8 is the only correct iocharset.

No, iocharset=utf8 is wrong always.

Here we just disagree, but I think I can change your opinion by giving you a correctly configured UTF-8 Japanese system in the form of a LiveCD (note: the kernel doesn't have this patch there). If you can afford ~500 MB of downloads, please do this:

1) download http://ftp.lfs-matrix.net/pub/lfs-livecd/lfslivecd-x86-6.2-5.iso and burn it
2) write (from Windows, because it cannot be misconfigured) some files with Japanese filenames to your flash drive
3) boot the CD, type the following at the boot prompt: linux LANG=ja_JP.UTF-8 TZ=Asia/Tokyo
4) edit /etc/X11/xorg.conf (should be probably unneeded, but I ask just in case. If you have to edit xorg.conf, report a bug to me privately)
5) startx (to get a Japanese environment)
6) open the terminal
7) yes --help (are the proper Japanese characters displayed? If not, change the font in the menu to some Kochi family and report this to me as a bug - yes I know about the Japanese opposition to Unicode because of Han Unification)
8) mount -o iocharset=utf8,codepage=932 /dev/sda1 /mnt (here I assume that your flash drive appears as sda1, adjust as needed)
9) ls -l /mnt (do Japanese filenames appear correctly?)
10) umount /mnt

If you want to experiment with the more traditional ja_JP.eucjp locale, you are also welcome to do so (but then iocharset=euc-jp will be needed to show filenames correctly).

Why I can't use utf8 for jfs or something, and use other nls for vfat?

Because you'll get a mismatch between the userspace locale and the iocharset used by the kernel in at least one of these two cases. The mismatch will result in the following:

1) Your system (e.g., the "ls" command) will not correctly interpret filenames written by known-good setups.
2) Known-good setups will not correctly interpret filenames written by your system.

What probably happened in your case is that you have no known-good jfs-based setup and are the only user of your jfs disk, and thus can't see (2). It looks like you don't use UTF-8 userspace locale. Then, the on-disk filenames on your jfs filesystem are wrong, but your system misinterprets them, and two errors seem to cancel each other (but this still doesn't make it a correct setup).

--
Alexander E. Patrakov
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/