Re: Weird tcp performance differences with 2.0 and 2.2 kernels

Peter Monta (pmonta@halibut.imedia.com)
Thu, 11 Feb 1999 22:15:07 -0800


I've also been seeing anomalous TCP performance with 2.2. I haven't
had a chance to check since 2.2.0pre6, but with that kernel TCP
transfers would reproducibly slow to a crawl after a while,
sending exactly five segments per second.

This happened with three simultaneous 100-megabit TCP connections and
also with a single gigabit TCP connection (Packet Engines G-NIC II,
hamachi.c v0.07). David Miller mentioned at the time that Jens Sorenson
had also seen it with the other flavor of gigabit card (Alteon ACEnic).
So it's perhaps unlikely to be network-driver-related.

Sorry I can't give a report relative to 2.2.2pre; I'll try to do
so over the next few days. But if you have two boxes, six
100baseT NICs, and three crossover cables, you should be able
to tell yea or nay.

Here is the 100baseT description and tcpdump. Please pardon this
posting if whatever causes it has been fixed over the past
few weeks.

Cheers,
Peter Monta pmonta@imedia.com
Imedia Corp.

----------------------------------------
The symptom: a short time after three large TCP transfers are started,
one of them gets wedged, making only very slow progress (one segment
every 0.2 sec). This happens roughly every other time the test
scripts are run (described below). Never happens with 2.0.36. It
also never happens when only two TCP transfers are running (perhaps
because they are then both able to drive the interfaces at wire
speed).
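
To put the wedged rate in perspective, here is a rough back-of-the-envelope sketch. It assumes the 1448-byte segment size seen in the tcpdump trace later in this message and the ttcp transfer size of 8192 * 16000 bytes; the variable and function names are illustrative only.

```python
# Rough estimate of how slow a wedged transfer is.
# Assumptions: 1448-byte segments (from the tcpdump trace in this report),
# five segments per second (one every 0.2 sec), and a ttcp transfer of
# buflen * nbuf = 8192 * 16000 bytes.
SEGMENT_BYTES = 1448
SEGMENTS_PER_SEC = 5
TOTAL_BYTES = 8192 * 16000


def wedged_rate(segment_bytes, segments_per_sec):
    """Bytes per second when one segment gets through per retransmit cycle."""
    return segment_bytes * segments_per_sec


def hours_to_finish(total_bytes, bytes_per_sec):
    """Hours needed to move total_bytes at the given rate."""
    return total_bytes / bytes_per_sec / 3600.0


rate = wedged_rate(SEGMENT_BYTES, SEGMENTS_PER_SEC)
hours = hours_to_finish(TOTAL_BYTES, rate)
print("wedged rate: %d bytes/sec (%.1f KB/sec)" % (rate, rate / 1024.0))
print("full transfer at that rate: about %.1f hours" % hours)
```

At roughly 7 KB/sec, a transfer that normally completes in under 15 seconds would need about five hours.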

The setup is two PII/350 boxes each with three 100baseT Tulip boards;
three crossover cables connect each NIC to its sibling on the other
machine. (The aim is to run a few semi-realistic benchmarks before
some gigabit-Ethernet cards arrive.) The boards are Kingston KNE100TX's,
all with genuine DEC 21140-AF controllers.

I have tried driver version 0.89H (in stock 2.2.0pre6) and version
0.90; same problem with both. All kernels are non-SMP.

All boards are detected okay; machine A configures them

ifconfig eth0 10.0.0.1
ifconfig eth1 11.0.0.1
ifconfig eth2 12.0.0.1

and machine B

ifconfig eth0 10.0.0.2
ifconfig eth1 11.0.0.2
ifconfig eth2 12.0.0.2

Then A runs this tcp-receiver script:

ttcp -r -s -p 5001 -n 16000 &
ttcp -r -s -p 5002 -n 16000 &
ttcp -r -s -p 5003 -n 16000 &

and then B runs the tcp-transmitter:

ttcp -t -s -p 5001 -n 16000 10.0.0.1 &
ttcp -t -s -p 5002 -n 16000 11.0.0.1 &
ttcp -t -s -p 5003 -n 16000 12.0.0.1 &

A typical normal output (on the receiver machine) looks like

ttcp-r: buflen=8192, nbuf=16000, align=16384/0, port=5001 tcp
ttcp-r: socket
ttcp-r: buflen=8192, nbuf=16000, align=16384/0, port=5002 tcp
ttcp-r: socket
ttcp-r: buflen=8192, nbuf=16000, align=16384/0, port=5003 tcp
ttcp-r: socket
ttcp-r: accept from 10.0.0.2
ttcp-r: accept from 11.0.0.2
ttcp-r: accept from 12.0.0.2
ttcp-r: 131072000 bytes in 12.25 real seconds = 10451.39 KB/sec +++
ttcp-r: 30060 I/O calls, msec/call = 0.42, calls/sec = 2454.44
ttcp-r: 0.0user 3.9sys 0:12real 32% 0i+0d 0maxrss 0+2pf 0+0csw
ttcp-r: 131072000 bytes in 13.46 real seconds = 9506.95 KB/sec +++
ttcp-r: 32415 I/O calls, msec/call = 0.43, calls/sec = 2407.56
ttcp-r: 0.0user 4.2sys 0:13real 31% 0i+0d 0maxrss 0+2pf 0+0csw
ttcp-r: 131072000 bytes in 14.05 real seconds = 9108.91 KB/sec +++
ttcp-r: 33470 I/O calls, msec/call = 0.43, calls/sec = 2381.84
ttcp-r: 0.0user 4.5sys 0:14real 32% 0i+0d 0maxrss 0+2pf 0+0csw
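
The throughput figures above can be sanity-checked with a little arithmetic. This is only a sketch: it recomputes KB/sec from the rounded "real seconds" values, so it agrees with ttcp's own numbers only to within rounding, and the raw wire rate used for comparison is an assumption (100 Mbit/s with no allowance for Ethernet, IP, or TCP framing overhead).

```python
# Recompute the ttcp throughput lines from buflen, nbuf and elapsed time.
BUFLEN = 8192
NBUF = 16000
# Assumption: raw 100 Mbit/s expressed in KB/sec, ignoring framing overhead.
WIRE_KB_PER_SEC = 100e6 / 8 / 1024


def kb_per_sec(nbytes, seconds):
    """Throughput in KB/sec (1 KB = 1024 bytes, as ttcp reports it)."""
    return nbytes / 1024.0 / seconds


total = BUFLEN * NBUF  # bytes moved per connection

# (real seconds, KB/sec) pairs copied from the receiver output above.
reported = [(12.25, 10451.39), (13.46, 9506.95), (14.05, 9108.91)]

print("bytes per connection: %d" % total)
for seconds, ttcp_rate in reported:
    rate = kb_per_sec(total, seconds)
    print("%.2f s: %.0f KB/sec recomputed, %.2f reported, %.0f%% of raw wire rate"
          % (seconds, rate, ttcp_rate, 100.0 * rate / WIRE_KB_PER_SEC))
```

The small differences between recomputed and reported rates come from ttcp using the unrounded elapsed time.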

An abnormal output simply lacks the last three lines, since the wedged
transfer never finishes. tcpdump on the wedged interface shows

00:16:09.854103 11.0.0.2.1052 > 11.0.0.1.5002: . 3144381016:3144382464(1448) ack 853400968 win 32120 <nop,nop,timestamp 332534 347965> (DF)
00:16:09.854159 11.0.0.1.5002 > 11.0.0.2.1052: . ack 1448 win 31856 <nop,nop,timestamp 347985 332534> (DF)
00:16:09.854339 11.0.0.2.1052 > 11.0.0.1.5002: . 1448:2896(1448) ack 1 win 32120 <nop,nop,timestamp 332534 347985> (DF)
00:16:10.054098 11.0.0.2.1052 > 11.0.0.1.5002: . 1448:2896(1448) ack 1 win 32120 <nop,nop,timestamp 332554 347985> (DF)
00:16:10.054141 11.0.0.1.5002 > 11.0.0.2.1052: . ack 2896 win 31856 <nop,nop,timestamp 348005 332554> (DF)
00:16:10.054320 11.0.0.2.1052 > 11.0.0.1.5002: . 2896:4344(1448) ack 1 win 32120 <nop,nop,timestamp 332554 348005> (DF)
00:16:10.254093 11.0.0.2.1052 > 11.0.0.1.5002: . 2896:4344(1448) ack 1 win 32120 <nop,nop,timestamp 332574 348005> (DF)
00:16:10.254128 11.0.0.1.5002 > 11.0.0.2.1052: . ack 4344 win 31856 <nop,nop,timestamp 348025 332574> (DF)
00:16:10.254306 11.0.0.2.1052 > 11.0.0.1.5002: P 4344:5792(1448) ack 1 win 32120 <nop,nop,timestamp 332574 348025> (DF)
00:16:10.454090 11.0.0.2.1052 > 11.0.0.1.5002: P 4344:5792(1448) ack 1 win 32120 <nop,nop,timestamp 332594 348025> (DF)
00:16:10.454127 11.0.0.1.5002 > 11.0.0.2.1052: . ack 5792 win 31856 <nop,nop,timestamp 348045 332594> (DF)
00:16:10.454306 11.0.0.2.1052 > 11.0.0.1.5002: . 5792:7240(1448) ack 1 win 32120 <nop,nop,timestamp 332594 348045> (DF)
00:16:10.654084 11.0.0.2.1052 > 11.0.0.1.5002: . 5792:7240(1448) ack 1 win 32120 <nop,nop,timestamp 332614 348045> (DF)
00:16:10.654124 11.0.0.1.5002 > 11.0.0.2.1052: . ack 7240 win 31856 <nop,nop,timestamp 348065 332614> (DF)
00:16:10.654302 11.0.0.2.1052 > 11.0.0.1.5002: . 7240:8688(1448) ack 1 win 32120 <nop,nop,timestamp 332614 348065> (DF)
00:16:10.854081 11.0.0.2.1052 > 11.0.0.1.5002: . 7240:8688(1448) ack 1 win 32120 <nop,nop,timestamp 332634 348065> (DF)
00:16:10.854118 11.0.0.1.5002 > 11.0.0.2.1052: . ack 8688 win 31856 <nop,nop,timestamp 348085 332634> (DF)
00:16:10.854295 11.0.0.2.1052 > 11.0.0.1.5002: . 8688:10136(1448) ack 1 win 32120 <nop,nop,timestamp 332634 348085> (DF)
00:16:11.054078 11.0.0.2.1052 > 11.0.0.1.5002: . 8688:10136(1448) ack 1 win 32120 <nop,nop,timestamp 332654 348085> (DF)
00:16:11.054115 11.0.0.1.5002 > 11.0.0.2.1052: . ack 10136 win 31856 <nop,nop,timestamp 348105 332654> (DF)
00:16:11.054294 11.0.0.2.1052 > 11.0.0.1.5002: . 10136:11584(1448) ack 1 win 32120 <nop,nop,timestamp 332654 348105> (DF)
00:16:11.254073 11.0.0.2.1052 > 11.0.0.1.5002: . 10136:11584(1448) ack 1 win 32120 <nop,nop,timestamp 332674 348105> (DF)
00:16:11.254111 11.0.0.1.5002 > 11.0.0.2.1052: . ack 11584 win 31856 <nop,nop,timestamp 348125 332674> (DF)
00:16:11.254290 11.0.0.2.1052 > 11.0.0.1.5002: . 11584:13032(1448) ack 1 win 32120 <nop,nop,timestamp 332674 348125> (DF)
00:16:11.454069 11.0.0.2.1052 > 11.0.0.1.5002: . 11584:13032(1448) ack 1 win 32120 <nop,nop,timestamp 332694 348125> (DF)
00:16:11.454105 11.0.0.1.5002 > 11.0.0.2.1052: . ack 13032 win 31856 <nop,nop,timestamp 348145 332694> (DF)
00:16:11.454285 11.0.0.2.1052 > 11.0.0.1.5002: P 13032:14480(1448) ack 1 win 32120 <nop,nop,timestamp 332694 348145> (DF)
00:16:11.654066 11.0.0.2.1052 > 11.0.0.1.5002: P 13032:14480(1448) ack 1 win 32120 <nop,nop,timestamp 332714 348145> (DF)
00:16:11.654104 11.0.0.1.5002 > 11.0.0.2.1052: . ack 14480 win 31856 <nop,nop,timestamp 348165 332714> (DF)
00:16:11.654285 11.0.0.2.1052 > 11.0.0.1.5002: . 14480:15928(1448) ack 1 win 32120 <nop,nop,timestamp 332714 348165> (DF)
00:16:11.854060 11.0.0.2.1052 > 11.0.0.1.5002: . 14480:15928(1448) ack 1 win 32120 <nop,nop,timestamp 332734 348165> (DF)
00:16:11.854096 11.0.0.1.5002 > 11.0.0.2.1052: . ack 15928 win 31856 <nop,nop,timestamp 348185 332734> (DF)
00:16:11.854273 11.0.0.2.1052 > 11.0.0.1.5002: . 15928:17376(1448) ack 1 win 32120 <nop,nop,timestamp 332734 348185> (DF)

I'm not a TCP expert, but it seems that 11.0.0.1.5002 is not hearing the
packet the first time around, so it is retransmitted 200ms later.
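
The retransmission pattern can be picked out of such a trace mechanically. The following is a minimal sketch, not a general tcpdump parser: the regular expression is written only for the line format shown above, the function name is made up for illustration, and the sample lines are copied from the trace.

```python
import re

# Matches data segments in the tcpdump format shown above, e.g.
# "00:16:09.854339 A > B: . 1448:2896(1448) ack 1 win ..."
# Pure ACK lines have no "start:end(len)" field and do not match.
DATA_RE = re.compile(
    r"^(\d+):(\d+):(\d+\.\d+) (\S+) > (\S+): (\S+) (\d+):(\d+)\((\d+)\)")


def find_retransmissions(lines):
    """Return (start, end, delay_seconds) for each segment seen more than once."""
    first_seen = {}
    found = []
    for line in lines:
        m = DATA_RE.match(line)
        if m is None:
            continue
        hh, mm, ss, src, dst, _flags, start, end, _length = m.groups()
        t = int(hh) * 3600 + int(mm) * 60 + float(ss)
        key = (src, dst, int(start), int(end))
        if key in first_seen:
            found.append((int(start), int(end), t - first_seen[key]))
        else:
            first_seen[key] = t
    return found


SAMPLE = [
    "00:16:09.854339 11.0.0.2.1052 > 11.0.0.1.5002: . 1448:2896(1448) ack 1 win 32120 <nop,nop,timestamp 332534 347985> (DF)",
    "00:16:10.054098 11.0.0.2.1052 > 11.0.0.1.5002: . 1448:2896(1448) ack 1 win 32120 <nop,nop,timestamp 332554 347985> (DF)",
    "00:16:10.054141 11.0.0.1.5002 > 11.0.0.2.1052: . ack 2896 win 31856 <nop,nop,timestamp 348005 332554> (DF)",
    "00:16:10.054320 11.0.0.2.1052 > 11.0.0.1.5002: . 2896:4344(1448) ack 1 win 32120 <nop,nop,timestamp 332554 348005> (DF)",
    "00:16:10.254093 11.0.0.2.1052 > 11.0.0.1.5002: . 2896:4344(1448) ack 1 win 32120 <nop,nop,timestamp 332574 348005> (DF)",
    "00:16:10.254128 11.0.0.1.5002 > 11.0.0.2.1052: . ack 4344 win 31856 <nop,nop,timestamp 348025 332574> (DF)",
]

retransmits = find_retransmissions(SAMPLE)
for start, end, delay in retransmits:
    print("segment %d:%d resent after %.1f ms" % (start, end, delay * 1000.0))
```

On the sample lines this reports each segment being sent twice, with the second copy following the first by very nearly 200 ms, and the ACK arriving only after the second copy.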

I'd be happy to provide any other information needed.

Cheers,
Peter Monta pmonta@imedia.com
Imedia Corp.
