Re: [PATCH RFC v2 4/4] 9p: fix race issue in fid contention.

From: Dominique Martinet
Date: Tue Nov 03 2020 - 05:41:35 EST


Jianyong,

Jianyong Wu wrote on Wed, Sep 23, 2020:
> Eric's and Greg's patch offer a mechanism to fix open-unlink-f*syscall
> bug in 9p. But there is race issue in fid comtention.
> As Greg's patch stores all of fids from opened files into according inode,
> so all the lookup fid ops can retrieve fid from inode preferentially. But
> there is no mechanism to handle the fid comtention issue. For example,
> there are two threads get the same fid in the same time and one of them
> clunk the fid before the other thread ready to discard the fid. In this
> scenario, it will lead to some fatal problems, even kernel core dump.
>
> I introduce a mechanism to fix this race issue. A counter field introduced
> into p9_fid struct to store the reference counter to the fid. When a fid
> is allocated from the inode or dentry, the counter will increase, and
> will decrease at the end of its occupation. It is guaranteed that the
> fid won't be clunked before the reference counter go down to 0, then
> we can avoid the clunked fid to be used.
>
> tests:
> race issue test from the old test case:
> for file in {01..50}; do touch f.${file}; done
> seq 1 1000 | xargs -n 1 -P 50 -I{} cat f.* > /dev/null
>
> open-unlink-f*syscall test:
> I have tested for f*syscall include: ftruncate fstat fchown fchmod faccessat.
>
> Signed-off-by: Jianyong Wu <jianyong.wu@xxxxxxx>

Ok so I've finally taken some time to test -- sorry for the delay.
I didn't bother with qemu but the use-after-close f* calls work with
nfs-ganesha and it doesn't introduce any obvious regression with the
current qemu either, so this is good for me.

Greg, Christian - from what I understood (in private, hopefully I'm
allowed to repeat!), he won't be able to contribute to qemu because of
company policies and I'm unlikely to take the time either right now.
I don't think it's a problem to continue as is though, we can land linux
kernel support (it's still useful for non-qemu servers) and if someone
is interested later on they'll just need to finish that bit.


I'm not seeing any obvious problem now so I'll take these patches in
-next within the next few days, with this extra patch on top that
basically applies the requests I had:
- removing the extra atomic_set in fs/9p/fid.c
- convert from atomic to refcount API (overflow checks)
- rename a no-composant walk to clone_fid()

I'll just run some more instrumented tests first to check we're not
leaking anything but so far I haven't found any problem.

If you noticed anything else please speak up.
Thanks for taking the time to finish this! :)
---
fs/9p/fid.c | 10 ++++------
fs/9p/fid.h | 2 +-
include/net/9p/client.h | 2 +-
net/9p/client.c | 4 ++--
4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index 89643dabcdae..50118ec72a92 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -28,7 +28,6 @@

static inline void __add_fid(struct dentry *dentry, struct p9_fid *fid)
{
- atomic_set(&fid->count, 1);
hlist_add_head(&fid->dlist, (struct hlist_head *)&dentry->d_fsdata);
}

@@ -62,7 +61,7 @@ static struct p9_fid *v9fs_fid_find_inode(struct inode *inode, kuid_t uid)
}
}
if (ret && !IS_ERR(ret))
- atomic_inc(&ret->count);
+ refcount_inc(&ret->count);
spin_unlock(&inode->i_lock);
return ret;
}
@@ -77,7 +76,6 @@ static struct p9_fid *v9fs_fid_find_inode(struct inode *inode, kuid_t uid)
void v9fs_open_fid_add(struct inode *inode, struct p9_fid *fid)
{
spin_lock(&inode->i_lock);
- atomic_set(&fid->count, 1);
hlist_add_head(&fid->ilist, (struct hlist_head *)&inode->i_private);
spin_unlock(&inode->i_lock);
}
@@ -110,7 +108,7 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any)
hlist_for_each_entry(fid, h, dlist) {
if (any || uid_eq(fid->uid, uid)) {
ret = fid;
- atomic_inc(&ret->count);
+ refcount_inc(&ret->count);
break;
}
}
@@ -201,7 +199,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
}
/* If we are root ourself just return that */
if (dentry->d_sb->s_root == dentry) {
- atomic_inc(&fid->count);
+ refcount_inc(&fid->count);
return fid;
}
/*
@@ -250,7 +248,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
fid = ERR_PTR(-ENOENT);
} else {
__add_fid(dentry, fid);
- atomic_inc(&fid->count);
+ refcount_inc(&fid->count);
spin_unlock(&dentry->d_lock);
}
}
diff --git a/fs/9p/fid.h b/fs/9p/fid.h
index 1fed96546728..f7f33509e169 100644
--- a/fs/9p/fid.h
+++ b/fs/9p/fid.h
@@ -28,7 +28,7 @@ static inline struct p9_fid *v9fs_fid_clone(struct dentry *dentry)
if (!fid || IS_ERR(fid))
return fid;

- nfid = p9_client_walk(fid, 0, NULL, 1);
+ nfid = clone_fid(fid);
p9_client_clunk(fid);
return nfid;
}
diff --git a/include/net/9p/client.h b/include/net/9p/client.h
index 58ed9bd306bd..e1c308d8d288 100644
--- a/include/net/9p/client.h
+++ b/include/net/9p/client.h
@@ -149,7 +149,7 @@ enum fid_source {
struct p9_fid {
struct p9_client *clnt;
u32 fid;
- atomic_t count;
+ refcount_t count;
int mode;
struct p9_qid qid;
u32 iounit;
diff --git a/net/9p/client.c b/net/9p/client.c
index a6c8a915e0d8..ba4910138c5b 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -901,7 +901,7 @@ static struct p9_fid *p9_fid_create(struct p9_client *clnt)
fid->clnt = clnt;
fid->rdir = NULL;
fid->fid = 0;
- atomic_set(&fid->count, 1);
+ refcount_set(&fid->count, 1);

idr_preload(GFP_KERNEL);
spin_lock_irq(&clnt->lock);
@@ -1466,7 +1466,7 @@ int p9_client_clunk(struct p9_fid *fid)
dump_stack();
return 0;
}
- if (!atomic_dec_and_test(&fid->count))
+ if (!refcount_dec_and_test(&fid->count))
return 0;

again: