Re: [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail

From: Christian Brauner
Date: Fri Dec 30 2022 - 10:37:44 EST


On Thu, Dec 29, 2022 at 10:55:05AM -0800, Linus Torvalds wrote:
> On Thu, Dec 29, 2022 at 12:49 AM kernel test robot
> <oliver.sang@xxxxxxxxx> wrote:
> >
> > generic/454 _check_generic_filesystem: filesystem on /dev/sda4 is inconsistent
>
> The commentary on that test is:
>
> Create xattrs with multiple keys that all appear the same
> (in unicode, anyway) but point to different values. In theory all
> Linux filesystems should allow this (filenames are a sequence of
> arbitrary bytes) even if the user implications are horrifying.
>
> and looking at the script it seems to indeed just do setfattr and
> getfattr with some unusual data (ie high bit set).
>
> Adding Ted, since this is apparently all on ext4. I guess it could be
> the vfs layer too, but it really doesn't tend to look very much at the
> xattr data, so.. Adding Christian Brauner anyway, since he's been
> working in this area for other reasons.

The test uses the user.* xattr namespace which should be unaffected by
the xattr changes we did the last few cycles.

>
> Ted, Christian - I cut down the report mercilessly. It's not really
> all that interesting, apart from the basic information of "xfstest
> generic/454 started failing consistently on ext4 at commit
> 3bc753c06dd0 ('kbuild: treat char as always unsigned')".
>
> If you think you need more, see
>
> https://lore.kernel.org/all/202212291509.704a11c9-oliver.sang@xxxxxxxxx/
>
> Also, I'm surprised this hasn't been an issue earlier - 'char' has
> always been unsigned on arm (among other architectures), so if this
> test started failing now on x86-64 due to -funsigned-char, it has
> presumably been failing on arm the whole time.
>
> I assume it's something that compares a 'char *name' by value, but the
> ones I looked at (eg xattr_find_entry() used strlen()/memcmp() which
> should be all good).
>
> Oh, I think I see one potential problem in ext4:
>
> ext4_xattr_hash_entry() is hot garbage. Lookie here:
>
> while (name_len--) {
> hash = (hash << NAME_HASH_SHIFT) ^
> (hash >> (8*sizeof(hash) - NAME_HASH_SHIFT)) ^
> *name++;
> }
>
> so that hash will now depend on the sign of that 'char *name' pointer.
>
> If that hash ever has any long-term meaning (ie saved on disk or
> exposed some other way), that would be problematic.

If the xattrs aren't storable in the inode then they are stored in a
separate block. The consist of a header and after that is an array of
struct ext4_xattr_entry entries.

Each of the entries store a hash value e_hash which is a hash of the
xattr name and xattr value.

The aforementioned header contains another h_hash field which seems to
be a hash of all e_hash fields of all xattrs.

For in-inode/ea-inode xattrs a hash for xattr name and value is stored
on disk as well but there's no header. The i_atime field contains a
checksum of only the value that is stored in the inode.

IOW, the bug might also depend on how the xattrs are stored. For
example, what xattrs might be stored in the inode depends on the inode
size that is chosen when the filesystem is created. But if I'm not
mistaken, it also depends on the block size.
So if the block size chosen for x86 differs from arm that might have an
impact on how the xattrs are stored and thus make the bug more or less
likely to appear?...