Re: [3.10.4] NFS locking panic, plus persisting NFS shutdown panic from 3.9.*

From: Jeff Layton
Date: Mon Aug 05 2013 - 13:37:56 EST


On Mon, 5 Aug 2013 16:15:01 +0000
"Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote:

> From 3c50ba80105464a28d456d9a1e0f1d81d4af92a8 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
> Date: Mon, 5 Aug 2013 12:06:12 -0400
> Subject: [PATCH] LOCKD: Don't call utsname()->nodename from
> nlmclnt_setlockargs
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in
> which case we're in entirely the wrong namespace.
> Secondly, commit 8aac62706adaaf0fab02c4327761561c8bda9448 (move
> exit_task_namespaces() outside of exit_notify()) now means that
> exit_task_work() is called after exit_task_namespaces(), which
> triggers an Oops when we're freeing up the locks.
>
> Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
> Cc: Toralf Förster <toralf.foerster@xxxxxx>
> Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
> Cc: Nix <nix@xxxxxxxxxxxxx>
> Cc: Jeff Layton <jlayton@xxxxxxxxxx>
> ---
> fs/lockd/clntproc.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
> index 9760ecb..acd3947 100644
> --- a/fs/lockd/clntproc.c
> +++ b/fs/lockd/clntproc.c
> @@ -125,14 +125,15 @@ static void nlmclnt_setlockargs(struct nlm_rqst *req, struct file_lock *fl)
>  {
>  	struct nlm_args *argp = &req->a_args;
>  	struct nlm_lock *lock = &argp->lock;
> +	char *nodename = req->a_host->h_rpcclnt->cl_nodename;
>
>  	nlmclnt_next_cookie(&argp->cookie);
>  	memcpy(&lock->fh, NFS_FH(file_inode(fl->fl_file)), sizeof(struct nfs_fh));
> -	lock->caller = utsname()->nodename;
> +	lock->caller = nodename;
>  	lock->oh.data = req->a_owner;
>  	lock->oh.len = snprintf(req->a_owner, sizeof(req->a_owner), "%u@%s",
>  				(unsigned int)fl->fl_u.nfs_fl.owner->pid,
> -				utsname()->nodename);
> +				nodename);
>  	lock->svid = fl->fl_u.nfs_fl.owner->pid;
>  	lock->fl.fl_start = fl->fl_start;
>  	lock->fl.fl_end = fl->fl_end;

Looks good to me...

Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
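
For anyone following along, here is a rough user-space sketch of why
reading a nodename cached on the RPC client is safer than going through
the calling task. Everything in it is a hypothetical stand-in (struct
uts_ns, struct task, struct rpc_client, client_init, build_owner); only
the field name cl_nodename and the "%u@%s" owner format come from the
patch. It models the idea only and is not the lockd code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space stand-ins, not the kernel's real structures. */
struct uts_ns { char nodename[65]; };
struct task { struct uts_ns *uts; };          /* NULL once namespaces are gone */
struct rpc_client { char cl_nodename[65]; };  /* snapshot taken at creation */

/* Copy the nodename while the creating task still has a valid namespace. */
static void client_init(struct rpc_client *clnt, const struct task *creator)
{
	assert(creator->uts != NULL);
	snprintf(clnt->cl_nodename, sizeof(clnt->cl_nodename), "%s",
		 creator->uts->nodename);
}

/*
 * Build the "pid@nodename" owner string from the cached copy. The calling
 * task is never consulted, so it does not matter which thread runs this
 * or whether that thread has already dropped its namespaces.
 */
static int build_owner(char *buf, size_t len, unsigned int pid,
		       const struct rpc_client *clnt)
{
	return snprintf(buf, len, "%u@%s", pid, clnt->cl_nodename);
}
```

In this model, a caller whose uts pointer has since been cleared (or a
reclaimer thread with a different one) still gets the name recorded when
the client was set up.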

Trond, any thoughts on the other oops that Nix posted? The issue there
seems to be that we're trying to do the pathwalk to the rpcbind unix
socket from exit_task_work(), but that's happening after we've already
called exit_fs().

The trivial answer seems to be to simply call exit_task_work() before
exit_fs() there, but it seems like we ought to be doing the upcall to
rpcbind in a mount namespace from which we know we can reach the
socket...
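
To make the ordering problem concrete, here is a toy user-space model.
The names loosely mirror the kernel functions discussed above, but every
type and function in it (struct fs_ctx, struct task, model_exit_fs,
model_exit_task_work, run_exit, lookup_rpcbind_socket) is a hypothetical
stand-in, and the model reports a failed lookup where the real kernel
would oops:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified stand-ins, not kernel code. */
struct fs_ctx { const char *root; };

struct task {
	struct fs_ctx *fs;            /* dropped by model_exit_fs() */
	bool (*work)(struct task *);  /* a single deferred callback */
	bool work_ok;                 /* result of the deferred callback */
};

/* Deferred work that needs a path walk, like the rpcbind socket lookup. */
static bool lookup_rpcbind_socket(struct task *t)
{
	return t->fs != NULL;  /* a path walk needs a root/cwd to start from */
}

static void model_exit_fs(struct task *t)
{
	t->fs = NULL;
}

static void model_exit_task_work(struct task *t)
{
	if (t->work) {
		t->work_ok = t->work(t);
		t->work = NULL;
	}
}

/* Run the two teardown steps in either order and report the outcome. */
static bool run_exit(struct task *t, bool work_before_fs)
{
	assert(t != NULL);
	if (work_before_fs) {
		model_exit_task_work(t);
		model_exit_fs(t);
	} else {
		model_exit_fs(t);
		model_exit_task_work(t);
	}
	return t->work_ok;
}
```

Swapping the order makes the lookup succeed in this model, which is the
trivial answer. It says nothing about whether the exiting task's mount
namespace can actually reach the socket, which is the part that still
bothers me.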