Re: Regression in 5.1.20: Reading long directory fails

From: J. Bruce Fields
Date: Fri Sep 06 2019 - 10:48:40 EST


On Tue, Sep 03, 2019 at 08:50:39PM -0500, Jason L Tibbitts III wrote:
> I asked the XFS folks who mentioned that the issues with 64 bit inodes
> are old, constrained to larger filesystems than what I'm using, not an
> issue with nfsv4, and not present on anything but 32bit clients with old
> userspace.
>
> In any case, I have been experimenting a bit and somehow the issue seems
> to be related to exporting with sec=krb5i:krb5p or sec=krb5i. If I
> export with just sec=krb5p, things magically begin to work.

That's interesting!

We've occasionally had bugs that are rare corner cases in the XDR
code--e.g. if the encoded directory data hits some limit at the same
time that we reach the end of a page, and the end of the page falls at
some offset with respect to the entry we're encoding.

Something like switching between krb5i and krb5p could shift the
offsets enough to change the likelihood of hitting such a case.
That's one guess, anyway.
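
To make the offset argument concrete, here's a toy sketch (not the
actual nfsd or client XDR code). It assumes a 4096-byte page, a
made-up fixed header length standing in for whatever the security
flavor puts in front of the directory data, and made-up entry sizes.
The function name and all the numbers are hypothetical; the point is
only that a few bytes of shift change which entries straddle a page
boundary.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def straddling_entries(header_len, entry_lens, page_size=PAGE_SIZE):
    """Return indices of entries whose bytes cross a page boundary.

    header_len is a hypothetical number of bytes preceding the first
    entry; entry_lens is a list of hypothetical encoded entry sizes.
    """
    hits = []
    offset = header_len
    for i, length in enumerate(entry_lens):
        first_page = offset // page_size
        last_page = (offset + length - 1) // page_size
        if first_page != last_page:
            hits.append(i)
        offset += length
    return hits


if __name__ == "__main__":
    entries = [40] * 300  # 300 equal-sized entries, purely made up
    print(straddling_entries(100, entries))
    print(straddling_entries(96, entries))
```

With these made-up numbers, moving the header from 100 to 96 bytes
makes the first page boundary fall exactly between two entries instead
of in the middle of one, so a bug that only triggers on a straddling
entry near that boundary would appear or disappear with the shift.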

> Anyway, I hope this helps to pinpoint the problem. I now have a really
> easy way to reproduce this without having to kick people off of the
> server, and if the successes aren't just some kind of false positives
> then I guess I also have a workaround. I'm still at a loss as to why a
> revert of the readdir changes makes any difference at all here.

Those readdir changes were client-side, right? Based on that I'd been
assuming a client bug, but maybe it'd be worth getting a full packet
capture of the readdir reply to make sure it's legit. Looking at it in
wireshark should tell us quickly whether it's corrupted somehow.
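
For the capture, something along these lines should do. It's just a
small helper that assembles a tcpdump command line; the interface
name, output file name, and helper name are placeholders I made up,
and it assumes NFS is on the usual port 2049. Run the resulting
command as root on the client or server while reproducing the failure,
then open the file in wireshark.

```python
import shlex


def tcpdump_command(interface, outfile, server=None, port=2049):
    """Build a tcpdump argv that writes full-size NFS packets to a file.

    interface, outfile and server are placeholders supplied by the
    caller; port 2049 is assumed to be where NFS is listening.
    """
    # -s 0 asks for full packets rather than a truncated snapshot,
    # -w writes raw packets to a file that wireshark can open.
    cmd = ["tcpdump", "-i", interface, "-s", "0", "-w", outfile]
    capture_filter = f"port {port}"
    if server:
        capture_filter = f"host {server} and {capture_filter}"
    cmd.append(capture_filter)
    return cmd


if __name__ == "__main__":
    # "eth0" and "readdir.pcap" are hypothetical example values.
    print(shlex.join(tcpdump_command("eth0", "readdir.pcap")))
```

Note that with krb5p the RPC payload is encrypted on the wire, so the
capture is most useful on the krb5i export, which is also where the
failure shows up.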

--b.