TCP throughput

Wayne J. Salamon (salamon@cmr.ncsl.nist.gov)
Mon, 01 Jun 1998 11:46:28 -0400


I've posted this to comp.os.linux.networking, and am looking
for comments, etc.

When sending over a TCP socket, if the message size is at the TCP
receiver window size, performance drops dramatically, from nearly 11
MBytes/s to 3.5 MBytes/s over Fast Ethernet (results are similar over
ATM).

What we found was that the last TCP segment was waiting to
be sent, even though previous ACKs had opened the window up.
(The sockets had TCP_NODELAY set, so the last segment should
not have waited.) I need to write this up in a better format,
but briefly, here's what happens (kernel 2.0.29):

When the message size is less than the receiver window,
tcp_sendmsg sends the last partial TCP segment, after calling
do_tcp_sendmsg (which does the segmenting), because there is
room for the partial in the receiver window, and TCP_NODELAY
is in effect.

When the message size is at the receiver window size, then
tcp_sendmsg will not send the partial segment, because there
is no room in the receiver window.
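
To make "last partial TCP segment" concrete, here is a small
user-space sketch of the segmenting arithmetic. It is not the kernel's
do_tcp_sendmsg; split_message and struct seg_split are hypothetical
names, and the 32768-byte message and 1460-byte MSS mentioned afterwards
are assumed values for illustration, not figures from the traces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT a kernel function: describes how one
 * message breaks down into MSS-sized segments plus a short tail. */
struct seg_split {
    size_t full_segments;  /* number of full MSS-sized segments */
    size_t partial_bytes;  /* size of the trailing short segment, 0 if none */
};

static struct seg_split split_message(size_t msg_len, size_t mss)
{
    struct seg_split s;

    assert(mss > 0);                 /* an MSS of zero makes no sense */
    s.full_segments = msg_len / mss; /* segments that fill a whole MSS */
    s.partial_bytes = msg_len % mss; /* what is left over for the last one */
    return s;
}
```

With the assumed values, split_message(32768, 1460) gives 22 full
segments and a 648-byte partial. That trailing short segment is the
one whose transmission is at issue here.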

Now, some ACKs come back from the receiver; usually the first
ACK acknowledges 6 or 7 segments. So the receiver window is
opened, and the partial packet could be sent. However, in tcp_ack,
the check for sending a partial packet does NOT take TCP_NODELAY
into account, but does check for packets in flight (the Nagle
algorithm), and that check prevents the partial segment from being
sent until all other outstanding segments are acknowledged. This
delay causes the drop in performance. Changing the code to
include the TCP_NODELAY check improves the performance back to
11 MBytes/s.
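
The difference between the two ACK-time checks can be modelled in a
few lines of user-space C. This is only a sketch of the logic as
described above, not the 2.0.29 tcp_ack code or the actual patch;
struct sender_state and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the sender at the moment an ACK arrives. */
struct sender_state {
    size_t   partial_bytes; /* queued short segment, 0 if none */
    size_t   window_free;   /* bytes the receiver window now allows */
    unsigned packets_out;   /* segments sent but not yet acknowledged */
    bool     nodelay;       /* TCP_NODELAY set on the socket */
};

/* There is a partial segment queued and the window has room for it. */
static bool partial_fits(const struct sender_state *s)
{
    assert(s != NULL);
    return s->partial_bytes > 0 && s->partial_bytes <= s->window_free;
}

/* Behaviour as described for 2.0.29: only the Nagle test is applied,
 * so the partial waits until nothing is left in flight. */
static bool ack_sends_partial_original(const struct sender_state *s)
{
    return partial_fits(s) && s->packets_out == 0;
}

/* Behaviour with the TCP_NODELAY check added: the Nagle test is
 * skipped when the socket has asked for no delay. */
static bool ack_sends_partial_patched(const struct sender_state *s)
{
    return partial_fits(s) && (s->nodelay || s->packets_out == 0);
}
```

In this model, with TCP_NODELAY set, room in the window and segments
still in flight, the original check holds the partial back while the
patched check lets it go out on the first ACK.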

Under kernel 2.1.79, this problem is not as noticeable because the
ACKs come back faster from the receiver. A 2.0.29 kernel sending
to a 2.1.79 kernel will not see the drop in performance, even though
the problem is with the 2.0.29 TCP code; the faster ACKs from the
2.1.79 receiver mask the delay of the partial segment.

I have traces and timings for this effect. I'd like some confirmation
from any Linux network persons on this problem. I'm not going to
call it a bug just now, as there may be legitimate reasons not
to check TCP_NODELAY on receiving an ACK.
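
For anyone trying to reproduce this, TCP_NODELAY is set on the sending
socket from user space with the standard setsockopt call. A minimal
sketch follows; enable_nodelay is a hypothetical wrapper name, and the
test program used for the measurements may well have done it
differently.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical wrapper: disables the Nagle algorithm on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
static int enable_nodelay(int fd)
{
    int one = 1;

    assert(fd >= 0); /* caller must pass an open socket descriptor */
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}
```

The option has to be set on the sender, since the sender is the side
that decides when a short segment goes out.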

WJS

-- 
----------------------------------------------------------------------
 | Wayne J. Salamon   | National Institute of Standards & Technology |
 | Computer Scientist | Gaithersburg, MD 20899                       |
 | wsalamon@nist.gov  |                                              |
----------------------------------------------------------------------
