Re: [PATCH 1/2] kernfs: dont take i_lock on inode attr read

From: Arnd Bergmann
Date: Thu Dec 29 2022 - 04:21:11 EST


On Fri, Dec 23, 2022, at 00:11, Ian Kent wrote:
> On 21/12/22 21:34, Anders Roxell wrote:
>> On 2022-10-31 12:30, Tejun Heo wrote:
>>> On Tue, Oct 18, 2022 at 10:32:42AM +0800, Ian Kent wrote:
>>>> The kernfs write lock is held when the kernfs node inode attributes
>>>> are updated. Therefore, when either kernfs_iop_getattr() or
>>>> kernfs_iop_permission() are called the kernfs node inode attributes
>>>> won't change.
>>>>
>>>> Consequently concurrent kernfs_refresh_inode() calls always copy the
>>>> same values from the kernfs node.
>>>>
>>>> So there's no need to take the inode i_lock to get consistent values
>>>> for generic_fillattr() and generic_permission(), the kernfs read lock
>>>> is sufficient.
>>>>
>>>> Signed-off-by: Ian Kent <raven@xxxxxxxxxx>
>>> Acked-by: Tejun Heo <tj@xxxxxxxxxx>
>> Hi,
>>
>> Building an allmodconfig arm64 kernel on yesterdays next-20221220 and
>> booting that in qemu I see the following "BUG: KCSAN: data-race in
>> set_nlink / set_nlink".
>
>
> I'll check if I missed any places where set_link() could be
> called where the link count could be different.
>
>
> If there aren't any the question will then be can writing the
> same value to this location in multiple concurrent threads
> corrupt it?

I think the race that is getting reported for set_nlink()
is about this bit getting called simulatenously on multiple
CPUs with only the read lock held for the inode:

/* Yes, some filesystems do change nlink from zero to one */
if (inode->i_nlink == 0)
atomic_long_dec(&inode->i_sb->s_remove_count);
inode->__i_nlink = nlink;

Since i_nlink and __i_nlink refer to the same memory location,
the 'inode->i_nlink == 0' check can be true for all of them
before the nonzero nlink value gets set, and this results in
s_remove_count being decremented more than once.

Arnd