Re: vfat: Broken case-insensitive support for UTF-8

From: Pali Rohár
Date: Mon Jan 20 2020 - 06:19:21 EST


On Monday 20 January 2020 00:09:31 Al Viro wrote:
> On Mon, Jan 20, 2020 at 12:33:48AM +0100, Pali Rohár wrote:
>
> > > Does the behaviour match how Windows handles that thing?
> >
> > Linux behavior does not match Windows behavior.
> >
> > On Windows, FAT32 (fastfat.sys) is case insensitive, and the file names
> > "ä" and "Ä" are treated as the same file. Windows does not allow you to
> > create both files. It says that the file already exists.
>
> So how is the mapping specified in their implementation? That's
> obviously the mapping we have to match.

The FAT specification (fatgen103.doc) is just a parody of a specification.
E.g. it requires you to use pencil and paper during implementation...

About case insensitivity, I found these parts in the specification:

"The UNICODE name passed to the file system is converted to upper case."

"UNICODE solves the case mapping problem prevalent in some OEM code
pages by always providing a translation for lower case characters to a
single, unique upper case character."

Which basically says nothing... I can deduce from it that the Unicode
standard should be used for the mapping table.
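
To illustrate that reading, here is a minimal Python sketch of comparing
names after a per-character upper case conversion. It assumes Python's
built-in Unicode case data as the mapping table, which is only a stand-in:
the table that fastfat.sys actually uses may differ. The helper names
(upcase_char, upcase_name, same_name) are hypothetical.

```python
def upcase_char(ch: str) -> str:
    """Map one character to a single upper case character (hypothetical helper)."""
    up = ch.upper()
    # Python's full case mapping can expand one character into several
    # (e.g. "\u00df" -> "SS"). The quoted text asks for a single, unique
    # upper case character, so in that case keep the character unchanged.
    # This fallback is an assumption, not documented fastfat.sys behavior.
    return up if len(up) == 1 else ch


def upcase_name(name: str) -> str:
    """Upper-case a file name one character at a time."""
    return "".join(upcase_char(ch) for ch in name)


def same_name(a: str, b: str) -> bool:
    """Treat two names as the same file if their upper-cased forms match."""
    return upcase_name(a) == upcase_name(b)
```

With such a comparison, same_name("\u00e4", "\u00c4") is true, so creating
the second file would be refused as already existing.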

But we already know that the specification contains mistakes, and what
is relevant is Microsoft's FAT implementation (fastfat.sys). It is now
open source on GitHub, so we can inspect how it implements upper case
conversion.

> > > That's the only reason to support that garbage at all...
> >
> > What do you mean by garbage?
>
> Case-insensitive anything... the only reason to have that crap at all
> is that native implementations are basically forcing it as fs
> image correctness issue.

You are right. But we need to deal with it.

> It's worthless on its own merits, but
> we can't do something that amounts to corrupting fs image when
> we access it for write.

If we implement the same upper case conversion as the reference
implementation (fastfat.sys), then we prevent "corrupting fs".

--
Pali Rohár
pali.rohar@xxxxxxxxx