Re: [GIT PULL] bcachefs fixes for 6.15-rc4

From: H. Peter Anvin
Date: Wed Apr 30 2025 - 22:49:02 EST


On 4/25/25 12:59, Theodore Ts'o wrote:

Another use case was Valve who wanted to support Windows games that
expected case folding to work. (Microsoft Windows; the gift that keeps
on giving...) In fact the engineer who worked on case folding was
paid by Valve to do the work.

That being said, I completely agree with Linus that case insensitivity
is a nightmare, and I don't really care about performance. The use
cases where people care about this don't have directories with a large
number of entries, and we **really** don't want to encourage more use
of case insensitive lookups. There's a reason why we haven't spent much effort
improving the CLI tools' support for case folding. It's good enough
that it works for Android and Valve, and that's fine.

[...]

Perhaps if we were going to do it all over, we might have only
supported ASCII, or ISO Latin-1, and not used Unicode at all. But
then I'm sure Valve or Android mobile handset manufacturers would be
unhappy that this might not be good enough for some country that they
want to sell into, like, say, Japan or more generally, any country
beyond US and Europe.

What we probably could do is to create our own table that didn't
support all Unicode scripts, but only the ones which are required by
Valve and Android. But that would require someone willing to do this
work on a volunteer basis, or convince some company to pay to do this
work. We could probably reduce the kernel size by doing this, and it
would probably make the code more maintainable. I'm just not sure
anyone thinks it's worthwhile to invest more into it. In fact, I'm a
bit surprised Kent decided he wanted to add this feature into bcachefs.

Sometimes, partitioning a feature which is only needed for backwards
compatibility is in fact the right approach. And throwing good
money after bad is rarely worth it.


[Yes, I realize I'm really late to weigh in on this discussion]

It is worth noting that Microsoft has basically declared their "recommended" case folding (upcase) table to be permanently frozen (for new filesystem instances, in the case where they use an on-disk translation table created at format time). As far as I know they have never supported anything other than 1:1 conversion of BMP code points, and have never done normalization.
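
To make "1:1 conversion of BMP code points, no normalization" concrete, here is a rough sketch of what a table-driven comparison of that kind looks like. Everything in it is an assumption for illustration: the function names are made up, and the table is derived from Python's own Unicode data, so it is not the recommended upcase table from the exFAT specification and will differ from it in detail.

```python
# Illustrative sketch only. The table below comes from Python's Unicode
# database, NOT from Microsoft's frozen upcase table; names are hypothetical.

def build_upcase_table():
    """Build a 65536-entry table mapping each BMP code point to one code point."""
    table = list(range(0x10000))          # default: every unit maps to itself
    for cp in range(0x10000):
        if 0xD800 <= cp <= 0xDFFF:
            continue                      # surrogate code units stay unchanged
        up = chr(cp).upper()
        # Keep only strict 1:1 mappings that stay inside the BMP.
        # Expanding mappings (e.g. one character to two) are dropped.
        if len(up) == 1 and ord(up) < 0x10000:
            table[cp] = ord(up)
    return table


UPCASE = build_upcase_table()


def fold_name(name):
    """Upcase a name one UTF-16 code unit at a time; no normalization is done."""
    raw = name.encode("utf-16-le")
    units = []
    for i in range(0, len(raw), 2):
        unit = raw[i] | (raw[i + 1] << 8)  # reassemble little-endian code unit
        units.append(UPCASE[unit])
    return tuple(units)


def names_equal(a, b):
    """Case-insensitive comparison in the limited, table-driven sense."""
    return fold_name(a) == fold_name(b)
```

With a scheme like this, `names_equal("README.TXT", "readme.txt")` holds, but a name containing "ß" does not match one spelled with "SS" (that mapping is not 1:1), and a precomposed "é" does not match "e" followed by a combining acute accent (no normalization is applied).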

The exFAT specification enumerates the full recommended upcase table, although in a somewhat annoying format (basically a hex dump of compressed data):

https://learn.microsoft.com/en-us/windows/win32/fileio/exfat-specification

This is basically an admission that the problems involved with case folding are unsolvable, and just puts a tourniquet on the wound.

It also means that "legacy OS compatibility" is really a totally different problem than "proper Unicode normalization" and that the former is far more limited in scope.

-hpa