Re: [PATCH 3.2 38/52] Btrfs: fix race when listing an inode's xattrs

From: Filipe Manana
Date: Thu Nov 26 2015 - 04:38:04 EST




On 11/26/2015 12:39 AM, Ben Hutchings wrote:
> On Wed, 2015-11-25 at 23:11 +0000, Luis Henriques wrote:
>> On Tue, Nov 24, 2015 at 10:33:59PM +0000, Ben Hutchings wrote:
>>> 3.2.74-rc1 review patch. If anyone has any objections, please let me know.
>>>
>>> ------------------
>>>
>>> From: Filipe Manana <fdmanana@xxxxxxxx>
>>>
>>> commit f1cd1f0b7d1b5d4aaa5711e8f4e4898b0045cb6d upstream.
>>>
>>> When listing a inode's xattrs we have a time window where we race against
>>> a concurrent operation for adding a new hard link for our inode that makes
>>> us not return any xattr to user space. In order for this to happen, the
>>> first xattr of our inode needs to be at slot 0 of a leaf and the previous
>>> leaf must still have room for an inode ref (or extref) item, and this can
>>> happen because an inode's listxattrs callback does not lock the inode's
>>> i_mutex (nor does the VFS does it for us), but adding a hard link to an
>>> inode makes the VFS lock the inode's i_mutex before calling the inode's
>>> link callback.
>>>
>>> If we have the following leafs:
>>>
>>> Leaf X (has N items) Leaf Y
>>>
>>> [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ] [ (257 XATTR_ITEM 12345), ... ]
>>> slot N - 2 slot N - 1 slot 0
>>>
>>> The race illustrated by the following sequence diagram is possible:
>>>
>>> CPU 1 CPU 2
>>>
>>> btrfs_listxattr()
>>>
>>> searches for key (257 XATTR_ITEM 0)
>>>
>>> gets path with path->nodes[0] == leaf X
>>> and path->slots[0] == N
>>>
>>> because path->slots[0] is >=
>>> btrfs_header_nritems(leaf X), it calls
>>> btrfs_next_leaf()
>>>
>>> btrfs_next_leaf()
>>> releases the path
>>>
>>> adds key (257 INODE_REF 666)
>>> to the end of leaf X (slot N),
>>> and leaf X now has N + 1 items
>>>
>>> searches for the key (257 INODE_REF 256),
>>> with path->keep_locks == 1, because that
>>> is the last key it saw in leaf X before
>>> releasing the path
>>>
>>> ends up at leaf X again and it verifies
>>> that the key (257 INODE_REF 256) is no
>>> longer the last key in leaf X, so it
>>> returns with path->nodes[0] == leaf X
>>> and path->slots[0] == N, pointing to
>>> the new item with key (257 INODE_REF 666)
>>>
>>> btrfs_listxattr's loop iteration sees that
>>> the type of the key pointed by the path is
>>> different from the type BTRFS_XATTR_ITEM_KEY
>>> and so it breaks the loop and stops looking
>>> for more xattr items
>>> --> the application doesn't get any xattr
>>> listed for our inode
>>>
>>> So fix this by breaking the loop only if the key's type is greater than
>>> BTRFS_XATTR_ITEM_KEY and skip the current key if its type is smaller.
>>>
>>> Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
>>> [bwh: Backported to 3.2: s/found_key\.type/btrfs_key_type(\&found_key)/]
>>
>> Actually, in my backport to 3.16 I decided to keep the usage of
>> 'found_key.type' instead, as the usage of btrfs_key_type() has been
>> dropped with commit 962a298f3511 ("btrfs: kill the key type accessor
>> helpers").
> [...]
>
> OK, that makes sense. btrfs in 3.2 is pretty inconsistent about using
> btrfs_key_type() anyway.

Using the type field directly, instead of the accessor, is perfectly
safe (the field is an u8 so no worries about endianness conversions
unlike, other field of struct btrfs_key which are u64s).

>
> Ben.
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/