Re: vfat: Broken case-insensitive support for UTF-8

From: Al Viro
Date: Mon Jan 20 2020 - 02:30:56 EST


On Mon, Jan 20, 2020 at 01:04:42PM +0900, OGAWA Hirofumi wrote:

> Also, not directly same issue though. There is related issue for
> case-insensitive. Even if we use some sort of internal wide char
> (e.g. in nls, 16bits), dcache is holding name in user's encode
> (e.g. utf8). So inefficient to convert cached name to wide char for each
> access.
>
> Relatively recent EXT4 case-insensitive may tackled this though, I'm not
> checking it yet.

What's more, comparisons in dcache lookups have to be very careful about
the rename-related issues. You can give false negatives if the name
changes under you; it's not a problem. You can even give a false positive
in case of name change happening in the middle of comparison; ->d_seq
mismatch will get caught and it will have the result discarded before it
causes problems. However, you can't e.g. assume that the string you are
trying to convert from utf8 to 16bit won't be changing right under you.
Again, the wrong result of comparison in such situation is not a problem;
wrong return value is not the worst thing that can happen to a string
function mistakenly assuming that the string is not changing under it.

And you very much need to be careful about the things you can access
there. E.g. something like "oh, I'll just look at the flags in the
inode of the parent of potential match" (as in the recently posted
series) is a bloody bad idea on many levels. Starting with "your
potential match is getting moved right now, and what used to be its
parent becomes negative by the time you get around to fetching its
->d_inode. Dereferencing the resulting NULL to get inode flags
is not pretty".

<checks ext4>
Yup, that bug is there as well, all right. Look:
#ifdef CONFIG_UNICODE
static int ext4_d_compare(const struct dentry *dentry, unsigned int len,
const char *str, const struct qstr *name)
{
struct qstr qstr = {.name = str, .len = len };
struct inode *inode = dentry->d_parent->d_inode;

if (!IS_CASEFOLDED(inode) || !EXT4_SB(inode->i_sb)->s_encoding) {

Guess what happens if your (lockless) call of ->d_compare() runs
into the following sequence:
CPU1: ext4_d_compare() fetches ->d_parent
CPU1: takes a hardware interrupt
CPU2: dentry gets evicted by memory pressure; so is its parent, since
it was the only thing that used to keep it pinned. Eviction of the parent
calls dentry_unlink_inode() on the parent, which zeroes its ->d_inode.
CPU1: comes back
CPU1: fetches parent's ->d_inode and gets NULL
CPU1: oopses on null pointer dereference.

It's not impossible to hit. Note that e.g. vfat_cmpi() is not vulnerable
to that problem - ->d_sb is stable and both the superblock and ->nls_io
freeing is RCU-delayed.

I hadn't checked ->d_compare() instances for a while; somebody needs to
do that again, by the look of it. The above definitely is broken;
no idea how many other instaces had grown such bugs...