Problem: Active TCP connection aborts for no reason

From: Veeriah Vijay-A19819
Date: Wed Oct 15 2003 - 17:34:02 EST


I am using 2.4.7 kernel and seem to have a problem where my TCP connection aborts
when both the client and the server are, perfectly fine, up and running and sending
data to each other. I have enabled keepalive on both the client and server side.

I was able to reproduce this problem by doing the following.
I setup the tcp parameters to be as follows :
tcp_keepalive_time = 2
tcp_keepalive_probes = 1
tcp_keepalive_intvl = 3
My server and client send data to each other approximately every 2 seconds.

>From tcpdump output I figured out the following. After a keepalive_probe is sent out (at the expiry of keepalive_time)
if it is immediately followed by some valid data, then the ack for the keepalive_probe from the remote side does
not seem to reset the probes_out to 0. Because of this when the keepalive_timer expires
the next time it RSTs the connection.

The following code in tcp_input.c, seems to be the problem
if ((prior_packets = tp->packets_out) == 0)
goto no_queue;
....
return 1;

no_queue:
tp->probes_out = 0
....

Since in my case prior_packets is not 0, because of the valid data, probes_out is not being reset.

In tcp_timer.c, the following code seems to reset the connection on the next keepalive
timer expiry,
if (!tp->keepalive_probes && tp->probes_out >= sysctl_tcp_keepalive_probes) ||
...)
{
tcp_SEnd_Active_reset (sk, GFP_ATOMIC);
tcp_write_err(sk);
goto out;
}

Can somebody let me know if I am missing something here or is it a genuine problem that
has been fixed in a later release.

I can provide the tcpdump if need be.

Thanks,
Vijay
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/