Re: NFS client packet storm on 2.6.27.x

From: Kasparek Tomas
Date: Thu Jun 25 2009 - 02:10:30 EST


On Wed, Apr 22, 2009 at 07:27:07PM +0200, Kasparek Tomas wrote:
> I got another client lockup today. It was a desktop so I have some more
> dmesg warnings about soft lockup caused probably by network cable unplug
> (but hopefully still showing what happens in rpciod) on
>
> http://merlin.fit.vutbr.cz/tmp/nfs/pckas-dmesg
>
> I can check with top, that rpciod was using 100% cpu. I limited the flow
> from client to server with firewall so I was able to save the server and
> get some tcpdump -s0 data (actually RPC null with ERR response from server)
>
> Just to remind, the client is 2.6.27.21 (i386), the server is 2.6.16.62
> (x86_64).

Hi, I have been testing the patches from

http://www.linux-nfs.org/Linux-2.6.x/2.6.27/

and found that

.../fixups_4/linux-2.6.27-001-respond_promptly_to_socket_errors.dif
.../fixups_4/linux-2.6.27-002-respond_promptly_to_socket_errors_2.dif

reduce the lockups from long-to-endless ones to 1-2 second ones, and the
client seems to lock up in fewer situations.

The packet storms have not recurred so far since I switched to the
2.6.27.24 (and .25) kernels, so they may also have been fixed by some other
patch in .24.

Together with the tcp_linger patch, this seems to improve the situation a
lot, to the point where I can use 2.6.27.x kernels.

Trond, would it be possible to get tcp_linger and the two patches above into
the 2.6.27.x stable queue so that others get these fixes?

Big thanks to all for your help.

--

Tomas Kasparek, PhD student      E-mail: kasparek@xxxxxxxxxxxx
CVT FIT VUT Brno, L127           Web:    http://www.fit.vutbr.cz/~kasparek
Bozetechova 1, 612 66            Fax:    +420 54114-1270
Brno, Czech Republic             Phone:  +420 54114-1220

jabber: tomas.kasparek@xxxxxxxxx
GPG: 2F1E 1AAF FD3B CFA3 1537 63BD DCBE 18FF A035 53BC
