Re: Ethernet errors

Brian May (bam@snoopy.apana.org.au)
Thu, 30 Sep 1999 08:50:45 +1000


On Wed, Sep 29, 1999 at 04:17:44AM -0400, Paul Gortmaker wrote:
> Brian A May wrote:
> >
> > Hello All,
> >
> > It was recently suggested that I replace my Etherexpress adaptor
> > with a cheap NE2000 driver, in order to fix the problems.
> >
> > Fair enough, but I still having similar problems with an ISA NE2000
> > card. Alan Cox once told me that the cheap set was known to be good, and
> > it must be an interrupt conflict problem. While I disagreed with the
> > interrupt conflict (nothing else uses IRQ 11), I did change the IRQ from
> > 11 to 15. I thought it fixed the problem, but it still re-occurs.
> >
> > However, with 2.2.12, I get more error information:
> > Sep 28 08:45:10 snoopy kernel: eth0: mismatched read page pointers 0 vs ff.
> > Sep 28 08:45:10 snoopy kernel: eth0: Too much work at interrupt, status 0xff
> > Sep 28 08:45:30 snoopy kernel: eth0: trigger_send() called with the transmitter busy.
> > Sep 28 08:45:39 snoopy kernel: eth0: Tx timed out, excess collisions. TSR=0xff, ISR=0xff, t=907.
> > Sep 28 08:45:39 snoopy kernel: Hw. address read/write mismap 0
>
> [...]
>
> > The Ethernet adaptor just died.
>
> That appears to be an accurate description - all the 0xff values (that
> includes the m'cast reg read/write check Alan added) that are being
> read tend to indicate the card fell off the planet. (Oxff = vacant I/O
> port). You have some sort of hardware problem/misconfiguration here.

It works on startup... Why should it become misconfigured during
use? The only thing I can think of is that something is touching the
programmable setup...

Looking through 8390.c, I see:

/* Remove one frame from the ring. Boundary is always a page behind. */

this_frame = inb_p(e8390_base + EN0_BOUNDARY) + 1;
if (this_frame >= ei_local->stop_page)
this_frame = ei_local->rx_start_page;

/* Someday we'll omit the previous, iff we never get this message.
(There is at least one clone claimed to have a problem.) */
if (ei_debug > 0 && this_frame != ei_local->current_page)
printk(KERN_ERR "%s: mismatched read page pointers %2x vs %2x.\n",
dev->name, this_frame, ei_local->current_page);

Well... I guess you can't ommit it just yet (not that it helps much
- it still crashes).

> > Warm rebooting didn't fix my problem. It detected the adaptor
> > was on IRQ 3, it is actually on IRQ 15. In addition, my PPP connection
> > failed to start (probably because it needed IRQ 3).

Again - sounds like something messed up the settings. But what??? Linux
of course, as it should prevent any other programs from accessing the
card.

Something I just noticed. When warm rebooting from a crash, I get the following
message:

Sep 28 08:47:50 snoopy kernel: ne.c:v1.10 9/23/94 Donald Becker (becker@cesdis.gsfc.nasa.gov)
Sep 28 08:47:50 snoopy kernel: NE*000 ethercard probe at 0x300: 00 40 33 29 11 55
Sep 28 08:47:50 snoopy kernel: eth0: NE2000 found at 0x300, using IRQ 3.

But my card is 0x340 on IRQ 15!!!!

Whats special about 0x300 and 3?

This is how it should look:

Sep 28 08:53:39 snoopy kernel: ne.c:v1.10 9/23/94 Donald Becker (becker@cesdis.gsfc.nasa.gov)
Sep 28 08:53:39 snoopy kernel: NE*000 ethercard probe at 0x340: 00 40 33 29 11 55
Sep 28 08:53:39 snoopy kernel: eth0: NE2000 found at 0x340, using IRQ 15.

Something really screwy there!

Some signature, so it really does look like it has been reconfigured.

> > I had to cold boot (as always) to fix the problem.
>
> Is the card a PnP card? It almost sounds like the card is being
> deactivated and requires the PnP BIOS to reactivate it again via
> a cold boot.

No, the disk is not PnP. To configure it, you boot DOS, run a special
program that allows you to setup IO and IRQ values, and optionally save
them to EEPROM.

I don't have the foggiest idea how these cards are programmed (how
can the software program it when it doesn't know its IO address in
advance???). I think other cards do the same thing though.

> > Any ideas? I seem to be having lots of networking problems lately.
> > It would be really nice if I could fix just 1...
>
> Make sure the ISA bus is running default 8MHz, conservative settings
> for I/O cycle wait states (if supplied in CMOS setup), double check
> for I/O conflict with another device (and/or try different I/O

All settings look conservative (perhaps too conservative) to me:

Auto Configuration : Enabled
AT Bus Clock : PCICLKI/4 (this is as slow as it will get)
DRAM Read Wait States : 2
DRAM Write Wait Sates : 1
Cache Burst Read : 2-1-1-1
Internal Cache WB/WT : Write Thru
External Cache WB/WT : Write Back
[...]
Ext-Cache with Dirty Bit : Yes

I think that is everything relevant. There are more settings for PCI.
CPU Clock / PCI Clock : 1 : 1

This computer is 100MHz. Does that really mean that the PCI Clock is
100MHz. This sounds so unrealistic (compared with the age of this
computer), I will just assume I am mistaken

> location, and even try another ISA bus slot...). Try another ne2k
> card if you have one handy, or try that one in another machine.

I have another ne2k card, some brand. It seems to work fine in another
computer, but that might just be since it isn't transmitting as much (it
was receiving the data at the time the transmitting device failed). I
think I previously swapped the cards with the same result. Should I try
again (just to make sure)?

Also, I have a 3Com Etherlink III card (not mine, but at least I can
use it for testing). Maybe I should try it??? (After I work out how to
program it).

On Wed, Sep 29, 1999 at 11:00:22AM +0000, Thomas Speck wrote:
> I am getting logs like this with a Planet PCMCIA Ethernetcard ENW-3502
> (10Mbps) on a Highscreen laptop. I already reported this behavior some
> time ago on this list. Currently I use kernel 2.2.13pre9 (gcc version
> 2.7.2.3.f.1). The number of logs increases by high usage of the card
> that is by doing things like ping -f. I can also exclude IRQ-problems.

Does that card use 8390.o??? (eg like NE2k?) Just curious.

> > The Ethernet adaptor just died.
>
> My ethenet adaptor stays alive, however there is some tremendous slowdown
> in data transfer.

How do you fix the problem? Does warm reset help? cold reset? Removing
the PCMCIA card and replacing it? What PCMCIA chipset do you have?

I don't understand much about PCMCIA autoconfig, but if this is the same
problem, it doesn't look like my problem is due to the configuration
being currupted...

Is PCMCIA configuration likely to become currupted?

-- 
Brian May <bam@snoopy.apana.org.au>

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/