Re: 3.18.1: broken directory with one file too many

From: J. Bruce Fields
Date: Thu Dec 18 2014 - 10:19:21 EST


On Thu, Dec 18, 2014 at 09:58:06AM -0500, Benjamin Coddington wrote:
> Frame 36 of nfs-client.pcap has this interesting string:
>
> 0ff0 00 01 3b f6 fb b6 26 16 8f 7c 00 00 00 41 62 74 ..;...&..|...Abt
> 1000 72 66 73 2d 32 30 00 00 00 00 00 00 00 00 30 36 rfs-20........06
> 1010 2d 66 69 78 2d 64 65 61 64 6c 6f 63 6b 2d 77 68 -fix-deadlock-wh
> 1020 65 6e 2d 6d 6f 75 6e 74 69 6e 67 2d 61 2d 64 65 en-mounting-a-de
> 1030 67 72 61 64 65 64 2d 66 73 2e 70 61 74 63 68 00 graded-fs.patch.

Yes, that looks like the server messing up the encoding of the reply.
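
For anyone who wants to check the bytes by hand, here is a minimal sketch that decodes the quoted excerpt. It assumes the 00 00 00 41 starting at byte 10 of the excerpt is the XDR length word for the entry name (my reading of the dump, not something wireshark confirmed); the constant and helper names are made up for illustration.

```python
import struct

# Bytes copied from the hex dump of frame 36 quoted above (offsets 0x0ff0-0x103f).
HEX_DUMP = """
00 01 3b f6 fb b6 26 16 8f 7c 00 00 00 41 62 74
72 66 73 2d 32 30 00 00 00 00 00 00 00 00 30 36
2d 66 69 78 2d 64 65 61 64 6c 6f 63 6b 2d 77 68
65 6e 2d 6d 6f 75 6e 74 69 6e 67 2d 61 2d 64 65
67 72 61 64 65 64 2d 66 73 2e 70 61 74 63 68 00
"""

# Assumption: the name's length word begins at byte 10 of this excerpt.
NAME_LENGTH_OFFSET = 10


def decode_xdr_opaque(data: bytes, offset: int) -> bytes:
    """Read one XDR variable-length opaque: 4-byte big-endian length, then the bytes."""
    (length,) = struct.unpack_from(">I", data, offset)
    return data[offset + 4 : offset + 4 + length]


data = bytes.fromhex("".join(HEX_DUMP.split()))
name = decode_xdr_opaque(data, NAME_LENGTH_OFFSET)

print("declared length:", len(name))
print("embedded NUL bytes:", name.count(b"\x00"))
print("before first NUL:", name.split(b"\x00", 1)[0])
print("after the NUL run:", name.lstrip(b"btrfs-20").lstrip(b"\x00"))
```

Under that assumption the declared length is 65 bytes and it covers the eight zero bytes in the middle of the name, so the length word and the tail of the name look intact while eight bytes of the name itself are zeroed. Anything that treats the name as a NUL-terminated string would stop at "btrfs-20".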

Holger, what's the difference between nfs-client.pcap and
nfs-server.pcap?

--b.

>
> ...
>
> Ben
>
> On Thu, 18 Dec 2014, J. Bruce Fields wrote:
>
> > On Thu, Dec 18, 2014 at 01:22:40PM +0100, Holger Hoffstätte wrote:
> > > On 12/17/14 22:22, J. Bruce Fields wrote:
> > > > On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
> > > >> (..oddly broken directory over NFS..)
> > > > That doesn't sound familiar. A network trace showing the READDIR would
> > > > be really useful. Since this is so reproducible, I think that should be
> > > > possible. So do something like:
> > > >
> > > > 	move the problem file into 3.14/
> > > > 	tcpdump -s0 -wtmp.pcap -i<relevant interface>
> > > > 	ls the directory on the client.
> > > > 	kill tcpdump
> > > > 	send us tmp.pcap and/or take a look at it with wireshark and see
> > > > 		what the READDIR response looks like.
> > >
> > > Thanks for your reply. I forgot to mention that removing other files seems to "fix" the problem, so it does not seem to be specifically the new file itself that is the cause.
> > >
> > > I captured the "ls 3.14 | head" sequence on both the client and the server, and put the tcpdump files here: http://hoho.duckdns.org/linux/ - let me know if that helped.
> >
> > On a quick skim, the server's READDIR responses look correct. The entry
> > btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> > is returned in frame 53 (with complete reassembled reply displayed by
> > wireshark in frame 63).
> >
> > You could double-check for me--just run "wireshark nfs-server.pcap",
> > look for packets labeled "Reply ... READDIR", and expand out the READDIR
> > op and directory listing. I don't see anything obviously wrong.
> >
> > It's interesting that there's only one LOOKUP in the trace, for btrfs-20
> > (returning, not surprisingly, NFS4ERR_NOENT). If the client failed to
> > parse that entry for some reason, then maybe in addition to getting the
> > filename wrong it also failed to get the attributes, triggering the
> > extra lookup/getattr.
> >
> > > Meanwhile I'll try older/plain (unpatched) kernels. So far reverting the client to vanilla 3.18.1 or 3.14.27 has not helped..
> >
> > I'm a little unclear: when you said "All this is on freshly baked
> > 3.18.1", are you describing the client, or the server, or both?
> >
> > --b.
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
