Still e1000-problems on 2.6.1: boot-specific network error behavior, lockups
From: Erkki Seppala
Date: Wed Jan 14 2004 - 15:03:55 EST
The card is the 64bit server edition:
00:08.0 Ethernet controller: Intel Corp. 82544EI Gigabit Ethernet
Controller (Copper) (rev 02)
The kernel is: Linux dyton 2.6.1lirc-lufs-dyton30 #6 SMP Sun Jan 11
18:43:06 EET 2004 i686 GNU/Linux
Recently I've noticed the error counts shown by ifconfig eth0
increasing. What makes it more interesting is that on some boots
the numbers stay at zero, while on others the situation is
completely intolerable. Sometimes the link drops during operation;
rmmod e1000 followed by modprobe e1000 recovers it - that is, if
both commands complete without hanging the machine, which
isn't always the case. Sometimes replugging the cable at the switch
brings the link back up.
My current uptime record with 2.6.* (including test kernels) is
somewhere around 10 days; with this problem I'm happy to reach 24
hours. The differences from the earlier setup are one DVD-RW drive and
IDE support being loaded, but I would expect the 430W power supply to be
sufficient for this system, and unused IDE support should have no effect.
eth0 looks like this at the moment:
eth0      Link encap:Ethernet  HWaddr 00:02:B3:A3:50:0E
          inet addr:192.168.2.38  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::202:b3ff:fea3:500e/64 Scope:Link
          inet6 addr: 2001:708:310:4a00:202:b3ff:fea3:500e/64 Scope:Global
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:140019 errors:3897 dropped:0 overruns:0 frame:3264
          TX packets:157062 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:92177901 (87.9 MiB)  TX bytes:113040323 (107.8 MiB)
          Interrupt:10 Base address:0x1000 Memory:b0820000-b0840000
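
To put a rough number on the RX line above, here is a minimal Python sketch that pulls the packet and error counters out of ifconfig-style text and computes their ratio. The function name rx_error_ratio is made up for illustration, and using "RX packets" as the denominator is an assumption; whether that counter includes the errored frames is not something I have verified.

```python
import re

def rx_error_ratio(ifconfig_text):
    """Return (errors, packets, errors/packets) from an ifconfig RX line.

    Hypothetical helper: assumes the old net-tools layout
    'RX packets:N errors:M ...' as shown in the output above.
    """
    m = re.search(r"RX packets:(\d+)\s+errors:(\d+)", ifconfig_text)
    if m is None:
        raise ValueError("no 'RX packets:... errors:...' line found")
    packets, errors = int(m.group(1)), int(m.group(2))
    # Guard against a freshly booted interface with zero packets.
    ratio = errors / packets if packets else 0.0
    return errors, packets, ratio

# Counters copied from the ifconfig output above.
sample = "RX packets:140019 errors:3897 dropped:0 overruns:0 frame:3264"
errors, packets, ratio = rx_error_ratio(sample)
print(f"{errors} errors in {packets} packets ({ratio:.2%})")
```

Run against a good boot and a bad boot, this would give two comparable percentages instead of raw counters.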
As said, both errors and frames stay at zero on some boots. In fact, I
transferred about 30 gigabytes of /dev/zero with no
errors. However, the system rebooted spontaneously at that point.
The behavior seems to have appeared after some 2.6.0-test kernel, or
perhaps after 2.6.0. As far as I can tell there were no changes in the
driver in that period, so perhaps this is related to some
initialization glitch due to some hardware combination?
I've disabled TSO on the card per a suggestion from Scott Feldman, which
removes the problem with iptables corrupting the packets. (After which
I also removed iptables from the kernel - my current kernel does
include it.) The offloading parameters are as follows:
# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
At the moment ethtool -S eth0 | grep -v ': 0$' displays:
NIC statistics:
rx_packets: 173519
tx_packets: 195596
rx_bytes: 112062405
tx_bytes: 140458927
rx_errors: 4787
multicast: 6
rx_crc_errors: 3998
rx_csum_offload_good: 173361
rx_csum_offload_errors: 1
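
For comparing these counters between boots, a small Python sketch along the following lines could parse the statistics text into a dictionary, drop the zero counters (the same filtering the grep does), and show how much of rx_errors is accounted for by rx_crc_errors. The helper names parse_ethtool_stats and nonzero_stats are hypothetical, and the "name: value" layout is assumed from the listing above.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines into a dict of integer counters.

    Hypothetical helper: lines without a numeric value, such as the
    'NIC statistics:' header, are skipped.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats

def nonzero_stats(stats):
    """Keep only counters that are not zero, like grep -v ': 0$'."""
    return {name: value for name, value in stats.items() if value != 0}

# Counters copied from the listing above.
sample = """NIC statistics:
rx_packets: 173519
rx_errors: 4787
rx_crc_errors: 3998
rx_csum_offload_errors: 1
"""
stats = nonzero_stats(parse_ethtool_stats(sample))
crc_share = stats["rx_crc_errors"] / stats["rx_errors"]
print(f"CRC errors make up {crc_share:.1%} of rx_errors")
```

A high CRC share would be consistent with corruption on the wire or in the receive path, though it does not by itself tell the cable, the switch and the card apart.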
Note that at one point I considered the possibility of a broken e1000;
however, I don't have a replacement to test that theory with. So if
this stuff seems too insane, perhaps it's just that ;).
Information about my system is available at:
http://www.modeemi.cs.tut.fi/~flux/e1000/
I've been considering getting some other card; I'll let you know if it
fixes the problem. (If it doesn't, that would indicate bad cables or a bad
switch, plus some other possibly unrelated hardware/OS problem - or a
broken e1000.)
--
_____________________________________________________________________
/ __// /__ ____ __ http://www.modeemi.fi/~flux/\ \
/ /_ / // // /\ \/ / \ /
/_/ /_/ \___/ /_/\_\@modeemi.fi \/