Re: nfs client hang

From: Andy Chittenden
Date: Thu Jul 29 2010 - 06:10:27 EST


On 2010-07-28 18:37, Chuck Lever wrote:
On 07/28/10 03:24 AM, Andy Chittenden wrote:
Resending, as it seems to have been corrupted on LKML!

The RPC client marks the socket closed, and the linger timeout is
cancelled. At this point, sk_shutdown should be set to zero, correct?
I don't see an xs_error_report() call here, which would confirm that the
socket took a trip through tcp_disconnect().
From my reading of tcp_disconnect(), it calls sk->sk_error_report(sk)
unconditionally. So the absence of an xs_error_report() message surely
means the exact opposite: tcp_disconnect() wasn't called. If it isn't
called, sk_shutdown is not cleared, and my revised tracing confirmed
that it was set to SEND_SHUTDOWN.
Sorry, that's what I meant above.

An xs_error_report() debugging message at that point in the log would
confirm that the socket took a trip through tcp_disconnect(). But I
don't see such a message.
I don't see how tcp_disconnect() gets called if the application does a
shutdown when the state is TCP_ESTABLISHED (or a myriad of other
states). It just seems to send a FIN. Should tcp_disconnect() be called?
If so, how? Alternatively, I wonder whether my patch that set
sk_shutdown to 0 in tcp_connect_init() is the correct fix after all.
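
The sequence in question can be sketched as a toy user-space model. This
is not kernel code: struct model_sock and every model_* function are
hypothetical stand-ins, and the behaviour they encode is only the reading
of tcp_disconnect(), shutdown and the proposed tcp_connect_init() change
given in this thread. Only the two flag values match the kernel's
definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Same bit values as the kernel's RCV_SHUTDOWN / SEND_SHUTDOWN. */
#define RCV_SHUTDOWN  1
#define SEND_SHUTDOWN 2

/* Hypothetical stand-in for struct sock; only the fields discussed here. */
struct model_sock {
    int  sk_shutdown;    /* shutdown flags */
    bool established;    /* true while "connected" */
    int  error_reports;  /* times the error-report callback would have run */
};

/* Shutdown of the send side in an established state: a FIN goes out and
 * SEND_SHUTDOWN is recorded. No error report is generated. */
void model_shutdown_wr(struct model_sock *sk)
{
    sk->sk_shutdown |= SEND_SHUTDOWN;
}

/* Disconnect path as read in this thread: sk_shutdown is cleared and the
 * error-report callback is invoked unconditionally. */
void model_disconnect(struct model_sock *sk)
{
    sk->established = false;
    sk->sk_shutdown = 0;
    sk->error_reports++;
}

/* Reconnect. clear_shutdown models the proposed patch that zeroes
 * sk_shutdown in tcp_connect_init(). */
void model_connect(struct model_sock *sk, bool clear_shutdown)
{
    if (clear_shutdown)
        sk->sk_shutdown = 0;
    sk->established = true;
}

/* Shutdown followed by reconnect, with no disconnect in between.
 * Returns the sk_shutdown value the reconnected socket ends up with. */
int model_reconnect_after_shutdown(bool patched)
{
    struct model_sock sk = { .sk_shutdown = 0, .established = true,
                             .error_reports = 0 };

    model_shutdown_wr(&sk);
    model_connect(&sk, patched);
    assert(sk.error_reports == 0);  /* no disconnect, so no error report */
    return sk.sk_shutdown;
}
```

In this model, model_reconnect_after_shutdown(false) leaves SEND_SHUTDOWN
set on the reconnected socket with no error report logged, which matches
the trace; passing true clears it, as the patch would. Whether the real
fix belongs in the connect path or in making sure the disconnect path is
taken is exactly the open question.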

--
Andy, BlueArc Engineering
