Re: [RFC PATCH v2 01/18] fat: Fix iocharset=utf8 mount option

From: OGAWA Hirofumi
Date: Tue Jan 10 2023 - 04:29:02 EST


Pali Rohár <pali@xxxxxxxxxx> writes:

> Currently iocharset=utf8 mount option is broken and error is printed to
> dmesg when it is used. To use UTF-8 as iocharset, it is required to use
> utf8=1 mount option.
>
> Fix iocharset=utf8 mount option to use be equivalent to the utf8=1 mount
> option and remove printing error from dmesg.

[...]

> -
> - There is also an option of doing UTF-8 translations
> - with the utf8 option.
> -
> -.. note:: ``iocharset=utf8`` is not recommended. If unsure, you should consider
> - the utf8 option instead.
> + **utf8** is supported too and recommended to use.
>
> **utf8=<bool>**
> - UTF-8 is the filesystem safe version of Unicode that
> - is used by the console. It can be enabled or disabled
> - for the filesystem with this option.
> - If 'uni_xlate' gets set, UTF-8 gets disabled.
> - By default, FAT_DEFAULT_UTF8 setting is used.
> + Alias for ``iocharset=utf8`` mount option.
>
> **uni_xlate=<bool>**
> Translate unhandled Unicode characters to special
> diff --git a/fs/fat/Kconfig b/fs/fat/Kconfig
> index 238cc55f84c4..e98aaa3bb55b 100644
> --- a/fs/fat/Kconfig
> +++ b/fs/fat/Kconfig
> @@ -93,29 +93,12 @@ config FAT_DEFAULT_IOCHARSET
> like FAT to use. It should probably match the character set
> that most of your FAT filesystems use, and can be overridden
> with the "iocharset" mount option for FAT filesystems.
> - Note that "utf8" is not recommended for FAT filesystems.
> - If unsure, you shouldn't set "utf8" here - select the next option
> - instead if you would like to use UTF-8 encoded file names by default.
> + "utf8" is supported too and recommended to use.

This patch only partially fixes the UTF-8 issue. I don't think we can
recommend an option that still only partially works.

[...]

> - opts->utf8 = IS_ENABLED(CONFIG_FAT_DEFAULT_UTF8) && is_vfat;
> -
> if (!options)
> goto out;
>
> @@ -1318,10 +1316,14 @@ static int parse_options(struct super_block *sb, char *options, int is_vfat,
> | VFAT_SFN_CREATE_WIN95;
> break;
> case Opt_utf8_no: /* 0 or no or false */
> - opts->utf8 = 0;
> + fat_reset_iocharset(opts);

This changes the behavior of, for example,
"iocharset=iso8859-1,utf8=no". Do we need this user-visible change here?
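
To illustrate the concern, here is a rough userspace model, not kernel
code. It assumes fat_reset_iocharset() puts opts->iocharset back to the
built-in default, as its name suggests. All model_* names and the default
charset value are made up for this sketch.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_FAT_DEFAULT_IOCHARSET. */
#define MODEL_DEFAULT_IOCHARSET "koi8-r"

/* Simplified model of the two fields of fat_mount_options involved. */
struct model_opts {
	char iocharset[32];
	bool utf8;
};

static void model_set_iocharset(struct model_opts *o, const char *name)
{
	strncpy(o->iocharset, name, sizeof(o->iocharset) - 1);
	o->iocharset[sizeof(o->iocharset) - 1] = '\0';
}

/* Before the patch: "utf8=no" only clears the flag. */
static void model_utf8_no_old(struct model_opts *o)
{
	o->utf8 = false;
}

/*
 * After the patch (assumption: the reset restores the default):
 * "utf8=no" also discards an iocharset given earlier on the command line.
 */
static void model_utf8_no_new(struct model_opts *o)
{
	model_set_iocharset(o, MODEL_DEFAULT_IOCHARSET);
}

/* After the patch the flag is derived from iocharset at the "out:" label. */
static void model_finish_new(struct model_opts *o, bool is_vfat)
{
	o->utf8 = strcmp(o->iocharset, "utf8") == 0 && is_vfat;
}
```

In this model, "iocharset=iso8859-1,utf8=no" keeps iso8859-1 with the old
handler, but ends up with the default charset with the new one.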

> break;
> case Opt_utf8_yes: /* empty or 1 or yes or true */
> - opts->utf8 = 1;
> + fat_reset_iocharset(opts);
> + iocharset = kstrdup("utf8", GFP_KERNEL);
> + if (!iocharset)
> + return -ENOMEM;
> + opts->iocharset = iocharset;
> break;
> case Opt_uni_xl_no: /* 0 or no or false */
> opts->unicode_xlate = 0;
> @@ -1359,18 +1361,11 @@ static int parse_options(struct super_block *sb, char *options, int is_vfat,
> }
>
> out:
> - /* UTF-8 doesn't provide FAT semantics */
> - if (!strcmp(opts->iocharset, "utf8")) {
> - fat_msg(sb, KERN_WARNING, "utf8 is not a recommended IO charset"
> - " for FAT filesystems, filesystem will be "
> - "case sensitive!");
> - }
> + opts->utf8 = !strcmp(opts->iocharset, "utf8") && is_vfat;

This is still broken, so I think we still need a warning here (with
tweaked wording).

> /* If user doesn't specify allow_utime, it's initialized from dmask. */
> if (opts->allow_utime == (unsigned short)-1)
> opts->allow_utime = ~opts->fs_dmask & (S_IWGRP | S_IWOTH);
> - if (opts->unicode_xlate)
> - opts->utf8 = 0;

The unicode_xlate option is mutually exclusive with utf8, so this needs
to be handled somewhere. (With this patch, both unicode_xlate and utf8
will be shown by show_options().)
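
One possible way to keep the exclusivity, as a userspace sketch only: fold
unicode_xlate into the place where the utf8 flag is derived.
model_effective_utf8() is a hypothetical name, and whether the check
belongs at this point in parse_options() is an open question.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns the utf8 flag that show_options() would
 * later report. unicode_xlate wins, mirroring the removed
 * "if (opts->unicode_xlate) opts->utf8 = 0;" lines.
 */
static bool model_effective_utf8(const char *iocharset, bool is_vfat,
				 bool unicode_xlate)
{
	if (unicode_xlate)
		return false;
	return is_vfat && strcmp(iocharset, "utf8") == 0;
}
```

With this, iocharset=utf8 together with uni_xlate yields utf8 == false,
so the two options would not both be reported.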

> + else if (utf8)
> + return fat_utf8_strnicmp(name->name, str, alen);
> + else
> + return nls_strnicmp(t, name->name, str, alen);
> }

I don't have a strong opinion, but maybe we should consolidate this into
an (inline) function? (FWIW, it may be better to refactor and provide
some filename helper functions that hide the details of nls/utf8
handling.)
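
Roughly what I have in mind, as a userspace sketch. struct model_nls and
all model_* functions are stand-ins invented here; the real code would
call nls_strnicmp() and the patch's fat_utf8_strnicmp() instead, and the
stand-ins below only fold ASCII case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nls_table: a 256-entry lowercase map. */
struct model_nls {
	unsigned char to_lower[256];
};

static unsigned char model_ascii_lower(unsigned char c)
{
	return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Fill the table with ASCII-only case folding. */
static void model_nls_init_ascii(struct model_nls *t)
{
	for (int i = 0; i < 256; i++)
		t->to_lower[i] = model_ascii_lower((unsigned char)i);
}

/* Stand-in for nls_strnicmp(): 0 if equal under the table, 1 otherwise. */
static int model_nls_strnicmp(const struct model_nls *t,
			      const unsigned char *a,
			      const unsigned char *b, size_t len)
{
	while (len--) {
		if (t->to_lower[*a++] != t->to_lower[*b++])
			return 1;
	}
	return 0;
}

/* Stand-in for fat_utf8_strnicmp(): folds ASCII bytes only in this model. */
static int model_utf8_strnicmp(const unsigned char *a,
			       const unsigned char *b, size_t len)
{
	while (len--) {
		if (model_ascii_lower(*a++) != model_ascii_lower(*b++))
			return 1;
	}
	return 0;
}

/* The consolidated helper: callers no longer branch on utf8 themselves. */
static inline int model_name_strnicmp(bool utf8, const struct model_nls *t,
				      const unsigned char *a,
				      const unsigned char *b, size_t len)
{
	if (utf8)
		return model_utf8_strnicmp(a, b, len);
	return model_nls_strnicmp(t, a, b, len);
}
```

Each compare site would then be a single call, and the nls/utf8 choice
would live in one place.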

Thanks.
--
OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>