Re: [PATCH 2/2] kernfs: remove mutex in kernfs_dop_revalidate

From: Fox Chen
Date: Thu Dec 03 2020 - 01:46:04 EST


On Thu, Dec 3, 2020 at 2:46 AM Tejun Heo <tj@xxxxxxxxxx> wrote:
>
> Hello,
>
> On Wed, Dec 02, 2020 at 10:58:37PM +0800, Fox Chen wrote:
> > There is a big mutex in kernfs_dop_revalidate which slows down the
> > concurrent performance of kernfs.
> >
> > Since kernfs_dop_revalidate only does some checks, the lock is
> > largely unnecessary. Also, according to kernel filesystem locking
> > document:
> > https://www.kernel.org/doc/html/latest/filesystems/locking.html
> > locking is not in the protocal for d_revalidate operation.
>
> That's just describing the rules seen from vfs side. It doesn't say anything
> about locking rules internal to each file system implementation.

Oh, Ok, I got it.

> > This patch remove this mutex from
> > kernfs_dop_revalidate, so kernfs_dop_revalidate
> > can run concurrently.
> >
> > Signed-off-by: Fox Chen <foxhlchen@xxxxxxxxx>
> > ---
> > fs/kernfs/dir.c | 9 +++------
> > 1 file changed, 3 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
> > index 9aec80b9d7c6..c2267c93f546 100644
> > --- a/fs/kernfs/dir.c
> > +++ b/fs/kernfs/dir.c
> > @@ -26,7 +26,6 @@ static DEFINE_SPINLOCK(kernfs_idr_lock); /* root->ino_idr */
> >
> > static bool kernfs_active(struct kernfs_node *kn)
> > {
> > - lockdep_assert_held(&kernfs_mutex);
> > return atomic_read(&kn->active) >= 0;
> > }
> >
> > @@ -557,10 +556,9 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
> >
> > /* Always perform fresh lookup for negatives */
> > if (d_really_is_negative(dentry))
> > - goto out_bad_unlocked;
> > + goto out_bad;
> >
> > kn = kernfs_dentry_node(dentry);
> > - mutex_lock(&kernfs_mutex);
> >
> > /* The kernfs node has been deactivated */
> > if (!kernfs_active(kn))
> > @@ -579,11 +577,8 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
> > kernfs_info(dentry->d_sb)->ns != kn->ns)
> > goto out_bad;
> >
> > - mutex_unlock(&kernfs_mutex);
> > return 1;
> > out_bad:
> > - mutex_unlock(&kernfs_mutex);
> > -out_bad_unlocked:
> > return 0;
> > }
>
> I don't see how this can be safe. Nothing even protects the dentry from
> turning negative in the middle and it may end up trying to deref NULL. I'm
> sure we can make this not need kernfs_mutex but that'd have to be a lot more
> careful.
>

Sorry Tejun, I don't get it. Even before the patch

if (d_really_is_negative(dentry))
goto out_bad_unlocked;

kn = kernfs_dentry_node(dentry);
mutex_lock(&kernfs_mutex); <-------- we lock here

status of d_really_is_negative is not preserved by the mutex. It could
turn negative between we checked it and we lock kernfs_mutex.
Is it a bug here??


thanks,
fox