Re: dcache_readdir NULL inode oops

From: Jan Glauber
Date: Thu Nov 29 2018 - 14:26:33 EST


On Wed, Nov 28, 2018 at 08:08:06PM +0000, Will Deacon wrote:
> I spent some more time looking at this today...
>
> On Fri, Nov 23, 2018 at 06:05:25PM +0000, Will Deacon wrote:
> > Doing some more debugging, it looks like the usual failure case is where
> > one CPU clears the inode field in the dentry via:
> >
> > devpts_pty_kill()
> >  -> d_delete()   // dentry->d_lockref.count == 1
> >   -> dentry_unlink_inode()
> >
> > whilst another CPU gets a pointer to the dentry via:
> >
> > sys_getdents64()
> >  -> iterate_dir()
> >   -> dcache_readdir()
> >    -> next_positive()
> >
> > and explodes on the subsequent inode dereference when trying to pass the
> > inode number to dir_emit():
> >
> > if (!dir_emit(..., d_inode(next)->i_ino, ...))
> >
> > Indeed, the hack below triggers a warning, indicating that the inode
> > is being cleared concurrently.
> >
> > I can't work out whether the getdents64() path should hold a refcount
> > to stop d_delete() in its tracks, or whether devpts_pty_kill() shouldn't
> > be calling d_delete() like this at all.
>
> So the issue is that opening /dev/pts/ptmx creates a new pty in /dev/pts,
> which disappears when you close /dev/pts/ptmx. Consequently, when we tear
> down the dentry for the magic new file, we have to take the i_rwsem of
> the *parent* so that concurrent path walkers don't trip over it whilst it's
> being freed. I wrote a simple concurrent program to getdents(/dev/pts/) in
> one thread, whilst another opens and closes /dev/pts/ptmx: it crashes the
> kernel in seconds.

I also wrote a test case and verified that your fix works. I also tried
replacing the open/close of /dev/ptmx with mkdir/rmdir, but that does not
trigger the error.
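
For anyone who wants to reproduce this, a test case along the lines Will
describes can be sketched roughly as below. This is only an illustration,
not the exact program either of us ran: struct repro, scan_thread(),
open_close_thread() and repro_run() are made-up names, and it assumes that
readdir() ends up in getdents64(), as it does with glibc.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper state; none of these names come from the kernel. */
struct repro {
	const char *dir;	/* directory to list, e.g. "/dev/pts" */
	const char *ptmx;	/* node to open and close, e.g. "/dev/pts/ptmx" */
	atomic_int stop;	/* set to 1 to ask both threads to exit */
	atomic_ulong scans;	/* completed directory listings */
	atomic_ulong opens;	/* successful open/close pairs */
};

/* Lists the directory in a loop; each readdir() batch is a getdents64(). */
static void *scan_thread(void *arg)
{
	struct repro *r = arg;

	while (!atomic_load(&r->stop)) {
		DIR *d = opendir(r->dir);

		if (!d)
			continue;
		while (readdir(d))
			;
		closedir(d);
		atomic_fetch_add(&r->scans, 1);
	}
	return NULL;
}

/* Opens and closes the ptmx node, so a pty entry appears and vanishes. */
static void *open_close_thread(void *arg)
{
	struct repro *r = arg;

	while (!atomic_load(&r->stop)) {
		int fd = open(r->ptmx, O_RDWR | O_NOCTTY);

		if (fd < 0)
			continue;
		close(fd);
		atomic_fetch_add(&r->opens, 1);
	}
	return NULL;
}

/* Runs both threads for msecs milliseconds; returns 0, or -1 on error. */
static int repro_run(struct repro *r, unsigned int msecs)
{
	struct timespec ts = {
		.tv_sec = msecs / 1000,
		.tv_nsec = (msecs % 1000) * 1000000L,
	};
	pthread_t scan, oc;

	atomic_store(&r->stop, 0);
	if (pthread_create(&scan, NULL, scan_thread, r))
		return -1;
	if (pthread_create(&oc, NULL, open_close_thread, r)) {
		atomic_store(&r->stop, 1);
		pthread_join(scan, NULL);
		return -1;
	}
	nanosleep(&ts, NULL);
	atomic_store(&r->stop, 1);
	pthread_join(scan, NULL);
	pthread_join(oc, NULL);
	return 0;
}
```

Calling repro_run() with .dir = "/dev/pts" and .ptmx = "/dev/pts/ptmx" from
a small main() gives the racing pair; build with -pthread. On an unpatched
kernel this may well oops the box, so only try it on a test machine.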

> Patch below, but I'd still like somebody else to look at this, please.

I wonder why no inode_lock on parent is needed for devpts_pty_new(), but
I'm obviously not a VFS expert... So your patch looks good to me and
clearly solves the issue.

thanks,
Jan

> Will
>
> --->8
>
> diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
> index c53814539070..50ddb95ff84c 100644
> --- a/fs/devpts/inode.c
> +++ b/fs/devpts/inode.c
> @@ -619,11 +619,17 @@ void *devpts_get_priv(struct dentry *dentry)
>   */
>  void devpts_pty_kill(struct dentry *dentry)
>  {
> -	WARN_ON_ONCE(dentry->d_sb->s_magic != DEVPTS_SUPER_MAGIC);
> +	struct super_block *sb = dentry->d_sb;
> +	struct dentry *parent = sb->s_root;
>
> +	WARN_ON_ONCE(sb->s_magic != DEVPTS_SUPER_MAGIC);
> +
> +	inode_lock(parent->d_inode);
>  	dentry->d_fsdata = NULL;
>  	drop_nlink(dentry->d_inode);
>  	d_delete(dentry);
> +	inode_unlock(parent->d_inode);
> +
>  	dput(dentry);	/* d_alloc_name() in devpts_pty_new() */
>  }