Re: Heads-up: 3.6.2 / 3.6.3 NFS server oops: 3.6.2+ regression?(also an unrelated ext4 data loss bug)

From: Myklebust, Trond
Date: Tue Oct 23 2012 - 13:57:17 EST


On Tue, 2012-10-23 at 17:44 +0000, Myklebust, Trond wrote:
> You can't hold a spinlock while sleeping. Both mutex_lock() and nsm_create() can definitely sleep.
>
> The correct way to do this is to grab the spinlock and recheck the value of ln->nsm_users inside the 'if (!IS_ERR())' condition. If it is still zero, bump it and set ln->nsm_clnt; otherwise bump it, get the existing ln->nsm_clnt, and call rpc_shutdown_client() on the redundant nsm client after dropping the spinlock.
>
> Cheers
> Trond
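
For readers following along, the pattern described in the quoted text (create with no lock held, recheck under the lock, discard the loser after unlocking) can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions, not the kernel code: a pthread mutex stands in for the spinlock, and struct client, client_create(), client_destroy(), client_get() and run_demo() are hypothetical names that do not exist in fs/lockd.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct client { int id; };

#define MAX_THREADS 64

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the spinlock */
static struct client *shared_clnt;  /* analogous to ln->nsm_clnt */
static unsigned int users;          /* analogous to ln->nsm_users */
static atomic_int created, destroyed;

/* May block (malloc here), so it must not be called with state_lock held. */
static struct client *client_create(void)
{
	struct client *c = malloc(sizeof(*c));

	if (c) {
		c->id = 1;
		atomic_fetch_add(&created, 1);
	}
	return c;
}

static void client_destroy(struct client *c)
{
	free(c);
	atomic_fetch_add(&destroyed, 1);
}

static struct client *client_get(void)
{
	struct client *clnt, *new;

	/* Fast path: a client is already installed. */
	pthread_mutex_lock(&state_lock);
	if (users > 0) {
		users++;
		clnt = shared_clnt;
		pthread_mutex_unlock(&state_lock);
		return clnt;
	}
	pthread_mutex_unlock(&state_lock);

	/* Slow path: create with the lock dropped. */
	new = client_create();
	if (!new)
		return NULL;

	/* Recheck under the lock: another thread may have won the race. */
	pthread_mutex_lock(&state_lock);
	if (users == 0) {
		shared_clnt = new;
		new = NULL;
	}
	clnt = shared_clnt;
	users++;
	pthread_mutex_unlock(&state_lock);

	/* Tear down the redundant client only after unlocking. */
	if (new)
		client_destroy(new);
	return clnt;
}

static void *worker(void *arg)
{
	*(struct client **)arg = client_get();
	return NULL;
}

/* Runs nthreads racing getters; returns the final user count, or -1 on mismatch. */
static int run_demo(int nthreads)
{
	pthread_t tid[MAX_THREADS];
	struct client *got[MAX_THREADS];
	int i, n;

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return -1;
	users = 0;
	shared_clnt = NULL;
	atomic_store(&created, 0);
	atomic_store(&destroyed, 0);

	for (i = 0; i < nthreads; i++)
		pthread_create(&tid[i], NULL, worker, &got[i]);
	for (i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	for (i = 0; i < nthreads; i++)
		if (got[i] == NULL || got[i] != got[0])
			return -1;

	n = (int)users;
	free(shared_clnt);	/* the one surviving client */
	shared_clnt = NULL;
	return n;
}
```

Calling run_demo(8) should leave every thread holding the same client pointer with a user count of 8, however many redundant clients were created and thrown away along the way.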

Can you please check if the following patch fixes the issue?

Cheers
Trond

8<--------------------------------------------------------
From 44a070455d246e09de0cefc8875833f21ca655e8 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
Date: Tue, 23 Oct 2012 13:51:58 -0400
Subject: [PATCH] LOCKD: fix races in nsm_client_get

Commit e9406db20fecbfcab646bad157b4cfdc7cadddfb (lockd: per-net
NSM client creation and destruction helpers introduced) contains
a nasty race on initialisation of the per-net NSM client because
it doesn't check whether or not the client is set after grabbing
the nsm_create_mutex.

Reported-by: Nix <nix@xxxxxxxxxxxxx>
Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
---
fs/lockd/mon.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c
index e4fb3ba..9755603 100644
--- a/fs/lockd/mon.c
+++ b/fs/lockd/mon.c
@@ -88,7 +88,7 @@ static struct rpc_clnt *nsm_create(struct net *net)
 static struct rpc_clnt *nsm_client_get(struct net *net)
 {
 	static DEFINE_MUTEX(nsm_create_mutex);
-	struct rpc_clnt *clnt;
+	struct rpc_clnt *clnt, *new;
 	struct lockd_net *ln = net_generic(net, lockd_net_id);

 	spin_lock(&ln->nsm_clnt_lock);
@@ -101,11 +101,19 @@ static struct rpc_clnt *nsm_client_get(struct net *net)
 	spin_unlock(&ln->nsm_clnt_lock);

 	mutex_lock(&nsm_create_mutex);
-	clnt = nsm_create(net);
-	if (!IS_ERR(clnt)) {
-		ln->nsm_clnt = clnt;
-		smp_wmb();
-		ln->nsm_users = 1;
+	new = nsm_create(net);
+	clnt = new;
+	if (!IS_ERR(new)) {
+		spin_lock(&ln->nsm_clnt_lock);
+		if (!ln->nsm_users) {
+			ln->nsm_clnt = new;
+			new = NULL;
+		}
+		clnt = ln->nsm_clnt;
+		ln->nsm_users++;
+		spin_unlock(&ln->nsm_clnt_lock);
+		if (new)
+			rpc_shutdown_client(new);
 	}
 	mutex_unlock(&nsm_create_mutex);
 out:
--
1.7.11.7


--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com