Re: [REGRESSION] 3.6-rc2 and 3.6-rc3: TCP/IP network connection hang

From: Martin Steigerwald
Date: Tue Sep 11 2012 - 05:01:17 EST


On Thursday, 23 August 2012, Eric Dumazet wrote:
> On Thu, 2012-08-23 at 22:35 +0200, Martin Steigerwald wrote:
> > Hi!
> >
> > It's a bit difficult to describe. With 3.6-rc2 and 3.6-rc3 from Linus' git
> > on a Lenovo ThinkPad T520, I get occasional network hangs:
> >
> > For example, when sending a small mail via SMTP to my Debian Squeeze
> > based server through an ASUS WL-500gP router (running Debian Squeeze
> > and some 2.6.34 kernel), KMail hangs.
> >
> > It just doesn't complete sending out the mail.
> >
> > I have seen this once with 3.6-rc2 and now also with 3.6-rc3, which I
> > tried because it had quite a few network fixes.
> >
> > Notebook: 10.0.0.10 (IP and MAC changed)
> > Gateway: 10.0.0.1 (IP and MAC changed)
> > Server: 194.150.191.11
> >
> >
> > Below is a tshark capture of such an occurrence.
> >
> > I had a network hang with 3.6-rc2 with something else as well, but I do
> > not remember what it was and whether it was upload or download.
> >
> > This is upload.
> >
> >
> > I have never seen this with any previous kernel up to 3.5.2 from Greg K.H.'s git.
> >
> >
> > merkaba:~> tshark -ni eth0
> > tshark: Lua: Error during loading:
> > [string "/usr/share/wireshark/init.lua"]:45: dofile has been disabled
> > Running as user "root" and group "root". This could be dangerous.
> > Capturing on eth0
> > 0.000000 10.0.0.10 -> 194.150.191.11 TCP 74 58915 > 25 [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1
> > TSval=15189797 TSecr=0 WS=128
> > 0.025222 194.150.191.11 -> 10.0.0.10 TCP 74 25 > 58915 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460
> > SACK_PERM=1 TSval=108848542 TSecr=15189797 WS=16
> > 0.025309 10.0.0.10 -> 194.150.191.11 TCP 66 58915 > 25 [ACK] Seq=1 Ack=1 Win=14720 Len=0 TSval=15189822
> > TSecr=108848542
> > 0.066680 194.150.191.11 -> 10.0.0.10 SMTP 116 S: 220 mail.lichtvoll.de ESMTP Postfix (Debian/GNU)
> > 0.066745 10.0.0.10 -> 194.150.191.11 TCP 66 58915 > 25 [ACK] Seq=1 Ack=51 Win=14720 Len=0 TSval=15189864
> > TSecr=108848553
> > 0.066881 10.0.0.10 -> 194.150.191.11 SMTP 89 C: EHLO merkaba.localnet
> > 0.092287 194.150.191.11 -> 10.0.0.10 TCP 66 25 > 58915 [ACK] Seq=51 Ack=24 Win=5792 Len=0 TSval=108848559
> > TSecr=15189864
> > 0.092351 194.150.191.11 -> 10.0.0.10 SMTP 206 S: 250-mail.lichtvoll.de | 250-PIPELINING | 250-SIZE 20000000 | 250-VRFY |
> > 250-ETRN | 250-STARTTLS | 250-ENHANCEDSTATUSCODES | 250-8BITMIME | 250 DSN
> > 0.092485 10.0.0.10 -> 194.150.191.11 SMTP 76 C: STARTTLS
> > 0.118043 194.150.191.11 -> 10.0.0.10 SMTP 96 S: 220 2.0.0 Ready to start TLS
> > 0.157589 10.0.0.10 -> 194.150.191.11 TCP 66 58915 > 25 [ACK] Seq=34 Ack=221 Win=15744 Len=0 TSval=15189955
> > TSecr=108848566
> > 0.166043 10.0.0.10 -> 194.150.191.11 SSL 292 Client Hello
> > 0.214300 194.150.191.11 -> 10.0.0.10 TLSv1 1510 Server Hello, Certificate, Server Key Exchange, Server Hello Done
> > 0.214389 10.0.0.10 -> 194.150.191.11 TCP 66 58915 > 25 [ACK] Seq=260 Ack=1665 Win=18688 Len=0 TSval=15190011
> > TSecr=108848589
> > 0.218072 10.0.0.10 -> 194.150.191.11 TLSv1 264 Client Key Exchange, Change Cipher Spec, Encrypted Handshake Message
> > 0.254985 194.150.191.11 -> 10.0.0.10 TLSv1 316 New Session Ticket, Change Cipher Spec, Encrypted Handshake Message
> > 0.258463 10.0.0.10 -> 194.150.191.11 TLSv1 135 Application Data
> > 0.285463 194.150.191.11 -> 10.0.0.10 TLSv1 215 Application Data
> > 0.287155 10.0.0.10 -> 194.150.191.11 TLSv1 151 Application Data
> > 0.313450 194.150.191.11 -> 10.0.0.10 TLSv1 135 Application Data
> > 0.313706 10.0.0.10 -> 194.150.191.11 TLSv1 183 Application Data
> > 0.347362 194.150.191.11 -> 10.0.0.10 TLSv1 151 Application Data
> > 0.349485 10.0.0.10 -> 194.150.191.11 TCP 1514 [TCP segment of a reassembled PDU]
> > 0.349522 10.0.0.10 -> 194.150.191.11 TLSv1 1327 Application Data
> > 0.350700 10.0.0.1 -> 10.0.0.10 ICMP 590 Destination unreachable (Fragmentation needed)
> > 0.384716 194.150.191.11 -> 10.0.0.10 TCP 78 [TCP Dup ACK 22#1] 25 > 58915 [ACK] Seq=2218 Ack=729 Win=7936 Len=0
> > TSval=108848632 TSecr=15190111 SLE=2177 SRE=3438
> > 0.392573 10.0.0.10 -> 194.150.191.11 TCP 1514 [TCP Retransmission] 58915 > 25 [ACK] Seq=729 Ack=2218 Win=24448
> > Len=1448 TSval=15190190 TSecr=108848632
> > 0.393809 10.0.0.1 -> 10.0.0.10 ICMP 590 Destination unreachable (Fragmentation needed)
> > 0.624613 10.0.0.10 -> 194.150.191.11 TCP 1514 [TCP Retransmission] 58915 > 25 [ACK] Seq=729 Ack=2218 Win=24448
> > Len=1448 TSval=15190422 TSecr=108848632
> > 0.625846 10.0.0.1 -> 10.0.0.10 ICMP 590 Destination unreachable (Fragmentation needed)
> > 1.089586 10.0.0.10 -> 194.150.191.11 TCP 1514 [TCP Retransmission] 58915 > 25 [ACK] Seq=729 Ack=2218 Win=24448
> > Len=1448 TSval=15190887 TSecr=108848632
> > 1.090836 10.0.0.1 -> 10.0.0.10 ICMP 590 Destination unreachable (Fragmentation needed)
> > 2.018584 10.0.0.10 -> 194.150.191.11 TCP 1514 [TCP Retransmission] 58915 > 25 [ACK] Seq=729 Ack=2218 Win=24448
> > Len=1448 TSval=15191816 TSecr=108848632
> > 2.019846 10.0.0.1 -> 10.0.0.10 ICMP 590 Destination unreachable (Fragmentation needed)
> > 3.878591 10.0.0.10 -> 194.150.191.11 TCP 1514 [TCP Retransmission] 58915 > 25 [ACK] Seq=729 Ack=2218 Win=24448
> > Len=1448 TSval=15193676 TSecr=108848632
> > 3.879797 10.0.0.1 -> 10.0.0.10 ICMP 590 Destination unreachable (Fragmentation needed)
> > 5.022069 10:00:00:01:aa:bb -> 10:00:00:10:cc:dd ARP 60 Who has 10.0.0.10? Tell 10.0.0.1
> > 5.022115 10:00:00:10:cc:dd -> 10:00:00:01:aa:bb ARP 42 10.0.0.10 is at 10:00:00:10:cc:dd
> > 7.594598 10.0.0.10 -> 194.150.191.11 TCP 1514 [TCP Retransmission] 58915 > 25 [ACK] Seq=729 Ack=2218 Win=24448
> > Len=1448 TSval=15197392 TSecr=108848632
> > 7.595882 10.0.0.1 -> 10.0.0.10 ICMP 590 Destination unreachable (Fragmentation needed)
> > 15.034613 10.0.0.10 -> 194.150.191.11 TCP 1514 [TCP Retransmission] 58915 > 25 [ACK] Seq=729 Ack=2218 Win=24448
> > Len=1448 TSval=15204832 TSecr=108848632
> > 15.035919 10.0.0.1 -> 10.0.0.10 ICMP 590 Destination unreachable (Fragmentation needed)
> > 29.914590 10.0.0.10 -> 194.150.191.11 TCP 1514 [TCP Retransmission] 58915 > 25 [ACK] Seq=729 Ack=2218 Win=24448
> > Len=1448 TSval=15219712 TSecr=108848632
> > 29.915903 10.0.0.1 -> 10.0.0.10 ICMP 590 Destination unreachable (Fragmentation needed)
> > 34.922706 10:00:00:10:cc:dd -> 10:00:00:01:aa:bb ARP 42 Who has 10.0.0.1? Tell 10.0.0.10
> > 34.923296 10:00:00:01:aa:bb -> 10:00:00:10:cc:dd ARP 60 10.0.0.1 is at 10:00:00:01:aa:bb
> >
> >
> > That's it. Nothing more happens.
> >
> >
> > Notebook:
> >
> > martin@merkaba:~> lspci -nn | grep Ethernet
> > 00:19.0 Ethernet controller [0200]: Intel Corporation 82579LM Gigabit Network Connection [8086:1502] (rev 04)
> >
> > martin@merkaba:~> cat /proc/version
> > Linux version 3.5.2-tp520 (martin@merkaba) (gcc version 4.7.1 (Debian 4.7.1-7) ) #1 SMP PREEMPT Sun Aug 19 12:39:04 CEST 2012
> >
> >
> > ASUS WL-500gP Premium:
> >
> > gayatri:~# lspci -nn
> > 00:00.0 Host bridge [0600]: Broadcom Corporation BCM4704 PCI to SB Bridge [14e4:4704] (rev 09)
> > 00:02.0 Network controller [0280]: Broadcom Corporation BCM4318 [AirForce One 54g] 802.11g Wireless LAN Controller
> > [14e4:4318] (rev 02)
> > 00:03.0 USB Controller [0c03]: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller [1106:3038] (rev 62)
> > 00:03.1 USB Controller [0c03]: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller [1106:3038] (rev 62)
> > 00:03.2 USB Controller [0c03]: VIA Technologies, Inc. USB 2.0 [1106:3104] (rev 65)
> >
> > gayatri:~# cat /proc/version
> > Linux version 2.6.34.5 (amain@amain-laptop) (gcc version 4.3.3 (GCC) ) #1 Sun Sep 26 18:20:27 CEST 2010
> >
> > (I tried to compile my own, but it didn't work out.)
> >
> > The Ethernet controller seems to be missing from the list above. I am
> > using the 100 MBit wired Ethernet port. Wireless is disabled.
> >
> >
> > Server is VMware ESX on some FSC server.
> >
> > Thanks,
>
> A fix is under way:
>
> http://git.kernel.org/?p=linux/kernel/git/davem/net.git;a=commit;h=9b04f350057863d1fad1ba071e09362a1da3503e

3.6-rc5 contains this commit, and the network works fine again.
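
For anyone finding this in the archive: the symptom in the quoted capture is
that the same 1448-byte segment (Seq=729) is retransmitted again and again,
each time answered by an ICMP "Fragmentation needed" from the gateway, with
the gap between retransmissions roughly doubling. A minimal sketch in Python
that checks this from the timestamps follows. The timestamps are copied from
the capture; the variable and function names are illustrative only and not
part of any tool.

```python
# Timestamps (in seconds) of the retransmissions of Seq=729, Len=1448,
# copied by hand from the tshark capture quoted in this thread.
retransmit_times = [
    0.392573, 0.624613, 1.089586, 2.018584,
    3.878591, 7.594598, 15.034613, 29.914590,
]


def consecutive_gaps(times):
    """Return the time gaps between consecutive retransmissions."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def growth_ratios(gaps):
    """Return how much each gap grew relative to the previous one."""
    return [current / previous for previous, current in zip(gaps, gaps[1:])]


gaps = consecutive_gaps(retransmit_times)
ratios = growth_ratios(gaps)

for gap in gaps:
    print(f"gap: {gap:.6f} s")
for ratio in ratios:
    print(f"growth: {ratio:.3f}x")

# A ratio close to 2 for every step means the sender is only backing off
# its retransmission timer; the segment length never changes.
print("exponential backoff:", all(1.9 < r < 2.1 for r in ratios))
```

If a capture of your own shows the same pattern (unchanged segment length,
doubling gaps, a "Fragmentation needed" reply after every attempt), it looks
like the hang reported here.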

It's committed already, but as documentation for this list:

Reported-and-tested-by: Martin Steigerwald <martin@xxxxxxxxxxxx>

Thanks,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7