Help: Connection drops with kernel 2.4.21-15 over Broadcom NetXtreme BCM5703 gigabit cards

From: Madhu Bandireddy
Date: Mon Sep 27 2004 - 22:14:44 EST


We are seeing a large number of connection drops on our SAP
application servers with the following config:

Linux: Redhat Enterprise 3.0
Kernel: 2.4.21-15
HW: Dell 2650 with Broadcom NetXtreme BCM5703 gigabit ethernet card
CPU: 2x 2.8GHz
Memory: 12GB
Ethernet driver: Tigon 3.0

We could not find any errors on the interface itself. We are seeing
most of these connections drops from sites which are connected to our
data center over site-to-site VPN connections with high latencies.
In addition to the Linux servers in our SAP application server set we
also have 2 HP PA-RISC servers running HPUX 11.0 server. We do not see
any such connection drop problems on the HPUX servers. This makes us
believe that the problem is more related to the Linux environment.
Another thing to note is that these connection drops seem to occur
when there a large number of connections (about 500 or so) to each
server.

SAP seems to think this an OS problem. We do not see any error
messages in any of the system logs or on the ethernet interfaces. The
network itself does not show any errors.

Here is the output of netstat -s:
lnas4 Fri Sep 24 11:11:07 CDT 2004
Tcp:
27157 active connections openings
37507 passive connection openings
4 failed connection attempts
318 connection resets received
209 connections established
70267903 segments received
64033785 segments send out
49041 segments retransmited
395 bad segments received.
3269 resets sent
TcpExt:
3 packets pruned from receive queue because of socket buffer overrun
16 ICMP packets dropped because socket was locked
ArpFilter: 0
31311 TCP sockets finished time wait in fast timer
920380 delayed acks sent
206 delayed acks further delayed because of locked socket
Quick ack mode was activated 19329 times
64619380 packets directly queued to recvmsg prequeue.
337477000 packets directly received from backlog
1199298288 packets directly received from prequeue
48 packets dropped from prequeue
1503618 packets header predicted
64689963 packets header predicted and directly queued to user
TCPPureAcks: 740615
TCPHPAcks: 48292250
TCPRenoRecovery: 0
TCPSackRecovery: 675
TCPSACKReneging: 0
TCPFACKReorder: 5
TCPSACKReorder: 2
TCPRenoReorder: 0
TCPTSReorder: 0
TCPFullUndo: 0
TCPPartialUndo: 0
TCPDSACKUndo: 0
TCPLossUndo: 5
TCPLoss: 641
TCPLostRetransmit: 0
TCPRenoFailures: 0
TCPSackFailures: 2879
TCPLossFailures: 281
TCPFastRetrans: 804
TCPForwardRetrans: 11
TCPSlowStartRetrans: 30896
TCPTimeouts: 5143
TCPRenoRecoveryFail: 0
TCPSackRecoveryFail: 174
TCPSchedulerFailed: 1
TCPRcvCollapsed: 146
TCPDSACKOldSent: 3987
TCPDSACKOfoSent: 203
TCPDSACKRecv: 39
TCPDSACKOfoRecv: 0
TCPAbortOnSyn: 0
TCPAbortOnData: 10
TCPAbortOnClose: 1
TCPAbortOnMemory: 0
TCPAbortOnTimeout: 222
TCPAbortOnLinger: 0
TCPAbortFailed: 0
TCPMemoryPressures: 0

Since the problem seems to be with connections over site-to-site VPN
and are generally more pronounced during periods of high loads. I
believe this is more a tuning issue or a bug.

Any help in resolving this problem is greatly appreciated.

Thanks
Madhu
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/