Re: Regression, bisected: sqlite locking failure on nfs

From: Trond Myklebust
Date: Mon Nov 01 2010 - 16:35:48 EST


On Mon, 2010-11-01 at 15:55 -0400, Chuck Lever wrote:
> On Nov 1, 2010, at 3:48 PM, Trond Myklebust wrote:
>
> > On Mon, 2010-11-01 at 15:45 -0400, Chuck Lever wrote:
> >> On Nov 1, 2010, at 3:42 PM, Trond Myklebust wrote:
> >>
> >>> On Mon, 2010-11-01 at 15:22 -0400, Trond Myklebust wrote:
> >>>> I suspect nlmclnt_lookup_host() is to blame. It appears to be the _only_
> >>>> thing in the kernel that actually sets this 'srcaddr' field, and it sets
> >>>> it to
> >>>>
> >>>> const struct sockaddr source = {
> >>>> .sa_family = AF_UNSPEC,
> >>>> };
> >>>>
> >>>> You triggered the bug by removing the line
> >>>>
> >>>> transport->srcaddr.ss_family = family;
> >>>>
> >>>> from xs_create_sock().
> >>>>
> >>>> Trond
> >>>
> >>> Does this fix the regression?
> >>>
> >>> Trond
> >>>
> >>> ----------------------------------------------------------------------------------------------
> >>> NLM: Fix a regression in lockd
> >>>
> >>> From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
> >>>
> >>> Nick Bowler reports:
> >>> There are no unusual messages on the client... but I just logged into
> >>> the server and I see lots of messages of the following form:
> >>>
> >>> nfsd: request from insecure port (192.168.8.199:35766)!
> >>> nfsd: request from insecure port (192.168.8.199:35766)!
> >>> nfsd: request from insecure port (192.168.8.199:35766)!
> >>> nfsd: request from insecure port (192.168.8.199:35766)!
> >>> nfsd: request from insecure port (192.168.8.199:35766)!
> >>>
> >>> Bisected to commit 9247685088398cf21bcb513bd2832b4cd42516c4 (SUNRPC:
> >>> Properly initialize sock_xprt.srcaddr in all cases)
> >>>
> >>> Apparently, removing the 'transport->srcaddr.ss_family = family' from
> >>> xs_create_sock() triggers this due to nlmclnt_lookup_host() incorrectly
> >>> initialising the srcaddr family to AF_UNSPEC.
> >>>
> >>> Reported-by: Nick Bowler <nbowler@xxxxxxxxxxxxxxxx>
> >>> Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
> >>> ---
> >>>
> >>> fs/lockd/host.c | 5 -----
> >>> 1 files changed, 0 insertions(+), 5 deletions(-)
> >>>
> >>>
> >>> diff --git a/fs/lockd/host.c b/fs/lockd/host.c
> >>> index 25e21e4..9ff0c0e 100644
> >>> --- a/fs/lockd/host.c
> >>> +++ b/fs/lockd/host.c
> >>> @@ -238,9 +238,6 @@ struct nlm_host *nlmclnt_lookup_host(const struct sockaddr *sap,
> >>> const char *hostname,
> >>> int noresvport)
> >>> {
> >>> - const struct sockaddr source = {
> >>> - .sa_family = AF_UNSPEC,
> >>> - };
> >>> struct nlm_lookup_host_info ni = {
> >>> .server = 0,
> >>> .sap = sap,
> >>> @@ -249,8 +246,6 @@ struct nlm_host *nlmclnt_lookup_host(const struct sockaddr *sap,
> >>> .version = version,
> >>> .hostname = hostname,
> >>> .hostname_len = strlen(hostname),
> >>> - .src_sap = &source,
> >>> - .src_len = sizeof(source),
> >>> .noresvport = noresvport,
> >>> };
> >>>
> >>>
> >>
> >>
> >> What about that memcpy() in nlm_lookup_host()? With this patch, wouldn't it be copying garbage into the host's srcaddr field?
> >>
> >
> > It shouldn't. ni->src_len is now zero.
>
> Yech. All this still assumes that ANYADDR is all zeroes, and that the memory this is going into is already initialized to zeroes. It's asking for trouble if we re-arrange all this someday.
>
> I've got an untested one line patch that should fix this for any upper layer caller. Posting now.
>

No! Upper layer callers should simply not be setting .saddress. Pretty
much the only thing that _should_ be setting .saddress is the lockd
callback, and possibly the nfsv4 server callback.

Trond
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/