Re: vfat: Broken case-insensitive support for UTF-8

From: OGAWA Hirofumi
Date: Mon Jan 20 2020 - 22:53:01 EST


"Theodore Y. Ts'o" <tytso@xxxxxxx> writes:

> On Mon, Jan 20, 2020 at 01:04:42PM +0900, OGAWA Hirofumi wrote:
>>
>> To be perfect, the table would have to emulate what Windows uses. That
>> could be the Unicode standard, or something else. And other filesystems
>> can use something different from what Windows uses.
>
> The big question is *which* version of Windows. vfat has been in use
> for over two decades, and vfat predates Windows starting to use Unicode
> in 2001. Before that, vfat would have been using whatever code page
> its local Windows installation was set to use; and I'm not sure if
> there was space in the FAT headers to indicate the codepage in use.
>
> It would be entertaining for someone with ancient versions of Windows
> 9x to create some floppy images using codepage 437 and 450, and then
> see what a modern Windows system does with those VFAT images --- would
> it break horribly when it tries to interpret them as UTF-16? Or would
> it figure it out? And if so, how? Inquiring minds want to know....

A perfect encoding converter would have to support all versions, if
Windows changed the table between them. But yes, you are right: a normal
user would be fine with the current Unicode standard and doesn't care
about subtle differences. A strict custom system will care about those
subtle differences, though, which is why I'm saying *perfect*.

I'm not against using the current Unicode standard. I'm just noting this.
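
As a rough illustration of why the choice of table matters, here is a
minimal Python sketch. It is not the kernel's or Windows' code: the
function name_equal() and both tables are hypothetical, and the tables
are simple one-to-one upcase maps invented for this example. The same
pair of names compares equal under one table and different under the
other.

```python
def name_equal(a, b, upcase):
    """Compare two names using a 1:1 upcase table (a dict of char -> char)."""
    def fold(s):
        # Characters missing from the table are left unchanged.
        return "".join(upcase.get(ch, ch) for ch in s)
    return fold(a) == fold(b)

# Hypothetical table 1: only ASCII a-z is upcased.
ascii_only = {chr(c): chr(c - 32) for c in range(ord("a"), ord("z") + 1)}

# Hypothetical table 2: same as table 1, plus U+00E9 -> U+00C9.
with_e_acute = dict(ascii_only)
with_e_acute["\u00e9"] = "\u00c9"

lower = "r\u00e9sum\u00e9.txt"
upper = "R\u00c9SUM\u00c9.TXT"

same_under_ascii = name_equal(lower, upper, ascii_only)
same_under_extended = name_equal(lower, upper, with_e_acute)
print(same_under_ascii, same_under_extended)
```

A system that depends on exact Windows behaviour would need the second
lookup to agree with whatever table Windows itself uses.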


BTW, VFAT has to store both the shortname (codepage) and the longname
(UTF-16), and uses both names to open a file. So even the latest Windows
should be using the current locale codepage to make the shortname for VFAT.

And before vfat, the msdos driver (in the Linux fs drivers) used the
shortname (codepage) only.
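
To make the two encodings concrete, here is a small Python sketch. It
is an illustration only: encode_names() is a hypothetical helper, it
does not build a real 8.3 entry (no padding, no "~1" tail, no
upcasing), and it simply encodes one name once with a codepage and once
as UTF-16LE. The point is that the shortname bytes depend on which
codepage is in effect, while the UTF-16 form does not.

```python
def encode_names(name, codepage):
    """Return (codepage bytes, UTF-16LE bytes) for one name.

    The first element is None if the name cannot be represented in the
    given codepage.
    """
    try:
        short = name.encode(codepage)
    except UnicodeEncodeError:
        short = None
    return short, name.encode("utf-16-le")

name = "\u00c9.TXT"  # U+00C9, LATIN CAPITAL LETTER E WITH ACUTE

short_437, long_a = encode_names(name, "cp437")
short_1252, long_b = encode_names(name, "cp1252")

print(short_437, short_1252, long_a)
```

The two codepage results differ in their first byte, so a reader that
assumes the wrong codepage would see a different shortname, while the
UTF-16 longname is the same in both cases.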

Thanks.
--
OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>