Re: 2.6.24: RPC: bad TCP reclen 0x00020090 (large)

From: Tom Tucker
Date: Mon Feb 18 2008 - 16:49:42 EST



On Mon, 2008-02-18 at 04:58 -0800, Andrew Morton wrote:
> (suitable cc added)
>
> (regression)
>
> On Wed, 13 Feb 2008 17:02:53 +0300 Michael Tokarev <mjt@xxxxxxxxxx> wrote:
>
> > Hello!
> >
> > After upgrading to 2.6.24 (from .23), we're seeing ALOT
> > of messages like in $subj in dmesg:
> >
> > Feb 13 13:21:39 paltus kernel: RPC: bad TCP reclen 0x00020090 (large)
> > Feb 13 13:21:46 paltus kernel: printk: 3586 messages suppressed.
> > Feb 13 13:21:46 paltus kernel: RPC: bad TCP reclen 0x00020090 (large)
> > Feb 13 13:21:49 paltus kernel: printk: 371 messages suppressed.
> > Feb 13 13:21:49 paltus kernel: RPC: bad TCP reclen 0x00020090 (large)
> > Feb 13 13:21:55 paltus kernel: printk: 2979 messages suppressed.
> > ...
> >
> > with linux NFS server. The clients are all linux too, mostly 2.6.23
> > and some 2.6.22.
> >
> > I found the "offending" piece of code in net/sunrpc/svcsock.c,
> > in routine svc_tcp_recvfrom() with condition being:
> >
> > if (svsk->sk_reclen > serv->sv_max_mesg) ...

The problem might be that the client is setting a bit in the RPC message
length field that is meant to be interpreted and masked off by the
server -- and we're not doing it yet. My bet is that 0x20000 is the bit
we're looking for. I'll poke around...

> >
> > This happens after a server reboot. At this point, client(s) are trying
> > to perform some NFS transaction and fail, and server starts generating
> > the above messages - till I do a umount followed by mount on all clients.
> > Before, such situation (nfs server reboot) were handled transparently,
> > ie, there was nothing to do, the mount continued working just fine when
> > the server comes back online.
> >
> > Now, I'm not sure if it's really 2.6.24-specific problem or a userspace
> > problem. Some time ago we also upgraded nfs-kernel-server (Debian)
> > package, and the remount-after-nfs-server-reboot problem started to
> > occur at THAT time (and it is something to worry about as well, I just
> > had no time to deal with it); but the dmesg spamming only appeared
> > with 2.6.24.
> >
> > How to debug the issue further on from this point?
> >
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/