The continuing Saga of the TCP Stall

Richard B. Johnson (root@analogic.com)
Mon, 25 Aug 1997 21:54:01 -0400 (EDT)


It's me again! I am continually plagued by long waits for responses to
keyboard commands when logged onto my system(s) from my Linux machine at
home. Sometimes it takes as long as 30 seconds to receive the echo of
a single character.

This only occurs when I am logged onto a Linux machine. I can communicate
with Sun and SGI Machines with no problem at all even though I am using
`rlogin` from a Linux machine at home.

Any attempt to use an editor from home is horrible. I have found that
it's a lot easier to use an editor on my Sun (from home), then send
the edited stuff to my Linux machines. The Sun is at work and is reached
through the exact same PPP link, with the exact same Linux machines
routing the packets!

I have found something interesting about this although I don't know
what it means.

Here is a truncated `tcpdump` of the ppp interface at my remote node
while ftp data are being sent from a Sun. This runs fine with no evidence
of a stall. As a matter of fact, the received data LED on my modem is
on fairly constantly, showing that the pipeline is kept full.

From a Sun to Linux Version 2.0.12

The Sun sends packets of 1460 bytes each and receives, for the most part,
one ACK per packet.

hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38172)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3270)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38173)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3271)
0:0:ff:1:2d:d9 45:c0:0:c6:a1:e8 184:
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38174)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38175)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3272)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38176)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3273)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38177)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3274)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38178)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3275)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38179)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3276)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38180)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38181)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3277)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38182)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3278)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38183)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3279)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38184)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3280)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38186)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3281)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38187)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3282)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38188)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3283)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38189)
groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3284)
hal.ftp-data > groveland.1062: tcp 892 (DF) (ttl 253, id 38190)
hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38191)
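
A quick way to check the ACK pattern in a dump like the one above is to
tally decoded data segments against zero-length segments. This is only a
sketch: the `tally` helper and the `excerpt` list are illustrative names,
the regular expression assumes the `tcpdump -q` line format shown here,
and lines that tcpdump did not decode are simply skipped.

```python
import re

# Matches decoded lines of `tcpdump -q` output such as
#   hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38172)
LINE = re.compile(r"^(\S+) > (\S+): tcp (\d+)")

def tally(lines):
    """Return (data_segments, zero_length_segments) for decoded TCP lines."""
    data = acks = 0
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue  # undecoded line, not counted either way
        if int(m.group(3)) > 0:
            data += 1   # segment carrying payload
        else:
            acks += 1   # zero-length segment, a pure ACK in this trace
    return data, acks

# Four consecutive lines copied from the trace above.
excerpt = [
    "0:0:ff:1:2d:d9 45:c0:0:c6:a1:e8 184:",
    "hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38174)",
    "hal.ftp-data > groveland.1062: tcp 1460 (DF) (ttl 253, id 38175)",
    "groveland.1062 > hal.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 3272)",
]
print(tally(excerpt))
```

For these four lines it prints `(2, 1)`: two data segments, one ACK, and
one skipped line. Running it over a whole saved dump gives the overall
ratio.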

Here is the exact same data being sent from Linux Version 2.1.51 to the
exact same Linux Version 2.0.12.

The dump, using the exact same command, shows strange information!
The command `tcpdump -t -vv -q -i ppp0` was used for __both__. Tcpdump
is apparently unable to decode the header of the incoming packets. It
looks as though I am receiving two packets of 1486 bytes each, but only
one is being ACKed. Also, I am receiving 1486 bytes per IP packet. Would
this exceed the 1500-byte MTU (Yes/No)?

40:0:3e:6:39:28 45:8:5:dc:c:96 1486:
groveland.1066 > chaos.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 4161)
40:0:3e:6:39:27 45:8:5:dc:c:97 1486:
40:0:3e:6:39:26 45:8:5:dc:c:98 1486:
groveland.1066 > chaos.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 4162)
40:0:3e:6:39:25 45:8:5:dc:c:99 1486:
groveland.1066 > chaos.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 4163)
40:0:3e:6:39:24 45:8:5:dc:c:9a 1486:
40:0:3e:6:39:23 45:8:5:dc:c:9b 1486:
groveland.1066 > chaos.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 4164)
40:0:3e:6:39:22 45:8:5:dc:c:9c 1486:
40:0:3e:6:39:21 45:8:5:dc:c:9d 1486:
groveland.1066 > chaos.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 4165)
40:0:3e:6:39:20 45:8:5:dc:c:9e 1486:
groveland.1066 > chaos.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 4166)
40:0:3e:6:39:1f 45:8:5:dc:c:9f 1486:
40:0:3e:6:39:1e 45:8:5:dc:c:a0 1486:
groveland.1066 > chaos.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 4167)
40:0:3e:6:39:1d 45:8:5:dc:c:a1 1486:
groveland.1066 > chaos.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 4168)
40:0:3e:6:39:1c 45:8:5:dc:c:a2 1486:
40:0:3e:6:39:1b 45:8:5:dc:c:a3 1486:
groveland.1066 > chaos.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 4169)
40:0:3e:6:39:1a 45:8:5:dc:c:a4 1486:
groveland.1066 > chaos.ftp-data: tcp 0 (DF) [tos 0x8] (ttl 63, id 4170)
40:0:3e:6:39:19 45:8:5:dc:c:a5 1486:

The MTU and MRU of the PPP link are only 1500 bytes. Could this be causing
the TCP stall problems?
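
One possible reading of the undecoded lines, offered as a sketch rather
than a diagnosis: if tcpdump treated the first 14 bytes of each frame as
an Ethernet header (destination, source, type) and printed the source
first, then the two colon-separated groups are really the first 12 bytes
of an IPv4 header. The function name `decode_as_ip` is made up for this
example, and the Ethernet misparse is an assumption, not something the
dump itself states.

```python
import struct

def decode_as_ip(line):
    """Reinterpret a 'src dst len:' dump line as the start of an IPv4 header.

    Assumption: the 12 address bytes are the first 12 bytes of the IP
    header, with the printed 'dst' group coming first on the wire.
    """
    src, dst, rest = line.split()

    def to_bytes(group):
        return bytes(int(b, 16) for b in group.split(":"))

    raw = to_bytes(dst) + to_bytes(src)  # restore on-the-wire order
    vihl, tos, total, ident, frag, ttl, proto, cksum = struct.unpack(
        "!BBHHHBBH", raw)
    return {
        "version": vihl >> 4,
        "ihl_words": vihl & 0x0F,
        "tos": tos,
        "total_length": total,            # bytes, IP header included
        "id": ident,
        "df": bool(frag & 0x4000),        # Don't Fragment flag
        "ttl": ttl,
        "protocol": proto,                # 6 = TCP, 1 = ICMP
        "bytes_after_fake_header": int(rest.rstrip(":")),
    }

hdr = decode_as_ip("40:0:3e:6:39:28 45:8:5:dc:c:96 1486:")
print(hdr["total_length"], hdr["protocol"], hdr["ttl"], hdr["df"])
```

This prints `1500 6 62 True`. Under that reading each undecoded line is a
TCP packet with an IP total length of 1500 bytes (the 14 bytes taken as a
fake header plus the 1486 reported), with DF set and TOS 0x8, so it fills
the 1500-byte MTU exactly without exceeding it. The same function applied
to the undecoded line in the Sun trace gives a 198-byte ICMP packet with
a TTL of 255. Why tcpdump decodes some packets and not others is not
explained by any of this.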

Cheers,
DJ
Richard B. Johnson
Analogic Corporation
Penguin : Linux version 2.1.51 on an i586 machine (66.15 BogoMips).
Warning : It's hard to stay on the trailing edge of technology.
Linux : Engineering tool
Windows : Typewriter