Re: PE R200 with tg3 is losing network connection
From: Jeremy Jackson
Date: Sat Jan 31 2009 - 10:50:34 EST
There might not be much help here, but my $0.02 is to check for a
motherboard BIOS update and a firmware update for the NICs. The
Broadcoms have flash inside for the ASF/IPMI firmware. It could be a
known issue.
I have some Tyan S2865 boards with tg3 onboard, for which the vendor
provided a daughter card for IPMI. I think the tg3 chip could do it by
itself, so I tried some ASF firmwares using B57DIAG.EXE from FreeDOS...
got it to do DHCP while the motherboard was "off", but the board started
having problems freezing during POST. I gave up, since nobody supported
it, and there were no chip/firmware docs from Broadcom.
It's sad the chip has so much potential that will never be unlocked.
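One way to narrow it down from the numbers already posted: the sketch
below (Python, purely an illustration, not anything taken from the tg3
driver) takes the RX byte counters from the quoted ifconfig output and
compares their growth with what a 1 Gbit/s link could carry. It assumes
the two samples really were 300 seconds apart and ignores framing
overhead; the function name is made up for this example.

```python
# Counter values copied from the quoted ifconfig output.
RX_BYTES_START = 554069270089    # "when connection problem just started"
RX_BYTES_5MIN = 1842559458589    # "5 minutes later"
INTERVAL_S = 300                 # assumption: samples exactly 5 minutes apart

# Upper bound for a 1 Gbit/s link, ignoring Ethernet framing overhead.
LINE_RATE_BYTES_PER_S = 1_000_000_000 // 8


def growth_per_second(start, end, seconds):
    """Return (whole bytes per second, leftover bytes) for a counter delta."""
    return divmod(end - start, seconds)


per_second, leftover = growth_per_second(RX_BYTES_START, RX_BYTES_5MIN, INTERVAL_S)
ratio = per_second / LINE_RATE_BYTES_PER_S

print(f"counter growth: {per_second} bytes/s (leftover {leftover})")
print(f"multiple of gigabit line rate: {ratio:.1f}")
print(f"equals 32-bit all-ones per second: {per_second == 0xFFFFFFFF}")
```

With the quoted values the counter grows by 0xFFFFFFFF per second, about
34 times the line rate, so the numbers cannot be real traffic. That
pattern would be consistent with statistics reads returning all ones,
which points at the hardware or firmware rather than the network, though
that is an inference from the arithmetic and not something confirmed.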
On Sat, 2009-01-31 at 06:29 +0100, Florian Lihl - INGATE GmbH wrote:
> Hi.
>
> I'm experiencing a really strange problem with a new R200 server.
>
> I'm using the on board network adapter #1 shared for both IPMI and Linux
> but with 2 different IP addresses.
> After a couple of hours or days the IP address assigned to Debian
> becomes unreachable. No ping or network traffic (in and out) comes
> through anymore.
> But the IPMI is still working fine. After I reboot the server via IPMI
> the Debian IP address comes up until the same issue starts again.
>
> The server is running Debian etchnhalf. I tried both the Debian tg3
> kernel module and a self compiled 3.92n but it doesn't make any difference.
>
> Syslog and dmesg don't show any errors but the output of ethtool is
> pretty strange. It cannot determine link mode or speed and I also can't
> set it manually.
> Ethtool -t shows me "The test result is FAIL".
> ifconfig shows me a massive numbers of packets when the network
> connection breaks but it still says the link was up.
>
> I have a lot of R300 and PE1850 servers working fine with exactly same
> setup in that same network.
> I have no idea what's causing this behavior. Any suggestions?
>
> I'm not sure if this is a hardware or a software error but I assume it's
> a hardware issue because of the ethtool output, isn't it? How can I
> narrow that down?
>
> Any help is highly appreciated, please let me know if I missed to
> provide any important piece of information.
>
> Below comes the system information.
>
> Linux host 2.6.24-etchnhalf.1-amd64 #1 SMP Tue Dec 2 17:21:26 UTC 2008
> x86_64 GNU/Linux
>
> Dmesg output:
>
> tg3.c:v3.86 (November 9, 2007)
> eth0: Tigon3 [partno(BCM95721A211F) rev 4201 PHY(5750)] (PCI Express)
> 10/100/1000Base-T Ethernet 00:10:18:3a:85:d2
> eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] WireSpeed[1] TSOcap[1]
> eth0: dma_rwctrl[76180000] dma_mask[64-bit]
>
>
> Network cards lspci -v: (I'm using the first one)
>
> 01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721
> Gigabit Ethernet PCI Express (rev 21)
> Subsystem: Broadcom Corporation NetXtreme BCM5721 Gigabit
> Ethernet PCI Express
> Flags: bus master, fast devsel, latency 0, IRQ 16
> Memory at dfaf0000 (64-bit, non-prefetchable) [size=64K]
> Capabilities: [48] Power Management version 2
> Capabilities: [50] Vital Product Data
> Capabilities: [58] Message Signalled Interrupts: Mask- 64bit+
> Queue=0/3 Enable-
> Capabilities: [d0] Express Endpoint IRQ 0
> Capabilities: [100] Advanced Error Reporting
> Capabilities: [13c] Virtual Channel
> Capabilities: [160] Device Serial Number d2-85-3a-fe-ff-18-10-00
> Capabilities: [16c] Power Budgeting
>
> 03:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721
> Gigabit Ethernet PCI Express (rev 21)
> Subsystem: Dell Unknown device 023c
> Flags: bus master, fast devsel, latency 0, IRQ 1275
> Memory at dfdf0000 (64-bit, non-prefetchable) [size=64K]
> Expansion ROM at <ignored> [disabled]
> Capabilities: [48] Power Management version 2
> Capabilities: [50] Vital Product Data
> Capabilities: [58] Message Signalled Interrupts: Mask- 64bit+
> Queue=0/3 Enable+
> Capabilities: [d0] Express Endpoint IRQ 0
> Capabilities: [100] Advanced Error Reporting
> Capabilities: [13c] Virtual Channel
> Capabilities: [160] Device Serial Number 1a-3e-f9-fe-ff-b9-19-00
> Capabilities: [16c] Power Budgeting
>
> 04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721
> Gigabit Ethernet PCI Express (rev 21)
> Subsystem: Dell Unknown device 023c
> Flags: bus master, fast devsel, latency 0, IRQ 17
> Memory at dfef0000 (64-bit, non-prefetchable) [size=64K]
> Capabilities: [48] Power Management version 2
> Capabilities: [50] Vital Product Data
> Capabilities: [58] Message Signalled Interrupts: Mask- 64bit+
> Queue=0/3 Enable-
> Capabilities: [d0] Express Endpoint IRQ 0
> Capabilities: [100] Advanced Error Reporting
> Capabilities: [13c] Virtual Channel
> Capabilities: [160] Device Serial Number 1b-3e-f9-fe-ff-b9-19-00
> Capabilities: [16c] Power Budgeting
>
>
> Ethtool output:
>
> ethtool eth0
> Settings for eth0:
> Supported ports: [ TP ]
> Supported link modes: 10baseT/Half 10baseT/Full
> 100baseT/Half 100baseT/Full
> 1000baseT/Half 1000baseT/Full
> Supports auto-negotiation: Yes
> Advertised link modes: 10baseT/Half 10baseT/Full
> 100baseT/Half 100baseT/Full
> 1000baseT/Half 1000baseT/Full
> Advertised auto-negotiation: Yes
> Speed: Unknown! (0)
> Duplex: Half
> Port: Twisted Pair
> PHYAD: 1
> Transceiver: internal
> Auto-negotiation: on
> Supports Wake-on: g
> Wake-on: d
> Current message level: 0x000000ff (255)
> Link detected: yes
>
> ethtool -t eth0
> The test result is FAIL
> The test extra info:
> nvram test (online) 0
> link test (online) 1
> register test (offline) 0
> memory test (offline) 0
> loopback test (offline) 3
> interrupt test (offline) 1
>
>
> ifconfig before connection problem:
>
> eth1 Link encap:Ethernet HWaddr 00:19:B9:F9:3E:1A
> inet addr:94.75.226.2 Bcast:94.75.226.63 Mask:255.255.255.192
> inet6 addr: fe80::219:b9ff:fef9:3e1a/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:154458 errors:0 dropped:0 overruns:0 frame:0
> TX packets:47746 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:18468788 (17.6 MiB) TX bytes:6482395 (6.1 MiB)
> Interrupt:16
>
>
> ifconfig when connection problem just started:
>
> eth1 Link encap:Ethernet HWaddr 00:19:B9:F9:3E:1A
> inet addr:94.75.226.2 Bcast:94.75.226.63 Mask:255.255.255.192
> inet6 addr: fe80::219:b9ff:fef9:3e1a/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:1662152497851 errors:554050781055
> dropped:554050781055 overruns:0 frame:2770253905275
> TX packets:1662152390966 errors:554050781055 dropped:0
> overruns:0 carrier:0
> collisions:554050781055 txqueuelen:1000
> RX bytes:554069270089 (516.0 GiB) TX bytes:554057269023
> (516.0 GiB)
> Interrupt:16
>
>
> 5 minutes later:
>
> eth1 Link encap:Ethernet HWaddr 00:19:B9:F9:3E:1A
> inet addr:94.75.226.2 Bcast:94.75.226.63 Mask:255.255.255.192
> inet6 addr: fe80::219:b9ff:fef9:3e1a/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:5527623063351 errors:1842540969555
> dropped:1842540969555 overruns:0 frame:9212704847775
> TX packets:5527622956466 errors:1842540969555 dropped:0
> overruns:0 carrier:0
> collisions:1842540969555 txqueuelen:1000
> RX bytes:1842559458589 (1.6 TiB) TX bytes:1842547457523 (1.6 TiB)
> Interrupt:16
>
>
> 10 minutes later:
>
> eth1 Link encap:Ethernet HWaddr 00:19:B9:F9:3E:1A
> inet addr:94.75.226.2 Bcast:94.75.226.63 Mask:255.255.255.192
> inet6 addr: fe80::219:b9ff:fef9:3e1a/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:9393093628851 errors:3131031158055
> dropped:3131031158055 overruns:0 frame:15655155790275
> TX packets:9393093521966 errors:3131031158055 dropped:0
> overruns:0 carrier:0
> collisions:3131031158055 txqueuelen:1000
> RX bytes:3131049647089 (2.8 TiB) TX bytes:3131037646023 (2.8 TiB)
> Interrupt:16
>
> and so on...
> --
> To unsubscribe from this list: send the line "unsubscribe linux-net" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Jeremy Jackson
Coplanar Networks
(519)489-4903
http://www.coplanar.net
jerj@xxxxxxxxxxxx