On Fri, 12 May 2000, Andrew Morton wrote:
> Donald Becker wrote:
> > > > Support for this NIC was introduced in 2.2.16pre2
> > Actually, support was available in May 1999 with the 99L driver using a
> > general catch-all for 900* cards, and September 1999 as an explicitly
> > identified card. It could have been in the kernel far earlier than now.
> You are correct. In this case, "support" simply refers to adding the
> PCI device IDs. The catchall will indeed allow the 2.2.x driver to
> support device ID 9004. (And the catchall trick doesn't appear to be
> supported by the 2.3.x PCI scanning code).
Which is a bad thing.
My PCI scan code had it for a well established reason.
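
For readers unfamiliar with the catch-all, here is a minimal sketch of masked PCI ID matching. It is not the actual pci-scan code: the struct layout, the function name find_pci_id() and the mask values are assumptions chosen for illustration. Only the 3Com vendor ID (0x10B7) and the device ID 9004 discussed above come from real hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a device matches when its ID agrees with
 * the entry's ID on every bit set in device_mask. */
struct pci_id_entry {
    const char *name;
    uint16_t vendor;
    uint16_t device;
    uint16_t device_mask;
};

/* Explicit entries come first so they win over the catch-all. */
static const struct pci_id_entry id_table[] = {
    { "3c900 (explicit entry)",       0x10B7, 0x9000, 0xFFFF },
    { "3c900 series (catch-all)",     0x10B7, 0x9000, 0xFFF0 }, /* mask assumed */
};

/* Return the first matching entry, or NULL if the card is unknown. */
static const struct pci_id_entry *find_pci_id(uint16_t vendor, uint16_t device)
{
    size_t n = sizeof(id_table) / sizeof(id_table[0]);

    for (size_t i = 0; i < n; i++) {
        const struct pci_id_entry *e = &id_table[i];

        if (vendor == e->vendor &&
            (device & e->device_mask) == (e->device & e->device_mask))
            return e;
    }
    return NULL;
}
```

With a table like this, a card reporting device ID 0x9004 has no explicit entry but still falls through to the catch-all, which is how a driver can run on a card it was never told about by name.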
> > This is NOT a magic CPU software timing loop. I don't write such things.
> > This is a completion check based on the value of interest -- PCI bus cycles.
> > Please understand what the code is doing before changing it.
> I'll agree with you there - we don't fully understand what is going on.
> The extraordinarily long times are observed when a download stall
> command is used on a heavily loaded, hubbed LAN. A LAN on which there
> are many collisions. A LAN on which I can easily force the '16
> successive collisions' condition.
Which chip was this with?
The download stall should be an instantaneous thing. The CPU must have the
PCI bus to send the command, and the chip only needs to stop requesting the
bus for downloading packets. At most the chip should finish a transfer it
had already requested, which should happen before reading the status.
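
To illustrate why the check is counted in bus cycles rather than CPU time, here is a minimal user-space sketch of a bounded completion poll. It is not the driver's code: struct fake_nic, read_status() and the CMD_IN_PROGRESS bit value are assumptions that stand in for a real status-register read over PCI.

```c
#include <stdint.h>

#define CMD_IN_PROGRESS 0x1000u   /* status bit; value assumed for illustration */

/* Stand-in for the chip: the command stays "in progress" for the
 * next busy_reads status reads, then completes. */
struct fake_nic {
    unsigned busy_reads;
    unsigned reads_done;
};

/* In a real driver this would be a register read, i.e. at least one
 * PCI bus transaction per call. Here it is simulated. */
static uint16_t read_status(struct fake_nic *nic)
{
    nic->reads_done++;
    if (nic->busy_reads > 0) {
        nic->busy_reads--;
        return CMD_IN_PROGRESS;
    }
    return 0;
}

/* Poll until the in-progress bit clears. Returns the number of status
 * reads used, or -1 if the budget of max_reads ran out. The bound is
 * expressed in register reads, so it does not shrink on a faster CPU. */
static int wait_for_completion(struct fake_nic *nic, int max_reads)
{
    for (int i = 1; i <= max_reads; i++) {
        if (!(read_status(nic) & CMD_IN_PROGRESS))
            return i;
    }
    return -1;
}
```

A command that completes immediately returns 1 here; one that is still busy after a budget of 4000 reads returns -1, which is the case under discussion.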
This behavior might have changed with the newer chips. The Download-Stall
method is no longer required, and the semantics might have been
unintentionally changed to "break" the command. If so, we will have to add
another transmit and receive function pair to handle Tornado chips (which
might not be a bad idea anyway). Note that I'm only guessing here -- it
could be another bug that we are seeing, such as an interrupt that is occurring
during the CommandComplete check.
Given that there are already two (and perhaps a third) Tx and Rx methods, the
natural question is why the driver has not been split.
The answer is that the Tx and Rx paths are very compact. They have to be
that way for good performance. Having multiple functions doesn't add much
to the driver size. But the large, complex media selection code is shared
across generations and card types.
Splitting the driver up by generation would result in three slightly smaller
drivers, with a total size much larger than now and version skew in the
media selection code.
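
The structure argued for above can be sketched as one driver with small per-generation Tx/Rx functions selected at init time and a single shared media-selection routine. Everything in this sketch is hypothetical: the names, the return codes and the placeholder media value are illustrative and are not taken from the driver source.

```c
enum chip_gen { GEN_VORTEX, GEN_BOOMERANG };

enum { PATH_PIO = 1, PATH_BUS_MASTER = 2 };   /* illustrative return codes */

struct nic {
    enum chip_gen gen;
    int media;                          /* set by the shared code below */
    int (*start_xmit)(struct nic *);    /* per-generation Tx path */
    int (*rx)(struct nic *);            /* per-generation Rx path */
};

/* Compact per-generation paths; the bodies are placeholders. */
static int vortex_start_xmit(struct nic *n)    { (void)n; return PATH_PIO; }
static int vortex_rx(struct nic *n)            { (void)n; return PATH_PIO; }
static int boomerang_start_xmit(struct nic *n) { (void)n; return PATH_BUS_MASTER; }
static int boomerang_rx(struct nic *n)         { (void)n; return PATH_BUS_MASTER; }

/* The large, shared part: one copy serves every generation, so there is
 * no version skew between per-generation drivers. */
static void select_media(struct nic *n)
{
    n->media = 0;   /* placeholder for the real media selection logic */
}

static void nic_init(struct nic *n, enum chip_gen gen)
{
    n->gen = gen;
    if (gen == GEN_VORTEX) {
        n->start_xmit = vortex_start_xmit;
        n->rx = vortex_rx;
    } else {
        n->start_xmit = boomerang_start_xmit;
        n->rx = boomerang_rx;
    }
    select_media(n);
}
```

Adding a further generation in this layout means one more small function pair and one more branch in nic_init(), while select_media() stays untouched.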
[[ In fact the driver was split during development of the 3c900 support.
You might still find the drivers "vortex.c" and "boomerang.c" in the dusty
corners of the web. ]]
> I suspect that the NIC has started to send a packet, encounters a
> collision and somehow blocks the downstall completion until it reaches a
> suitable state. It _should_ just give up and leave the packet in the Tx
> Bogdan was getting driver failure after <30 minutes with the 2.2
> driver. After upping the counter he has run his tests for four days.
> He's on a switched LAN, which tends to torpedo the above theories...
I'm going with either
- new hardware badly emulates the old behavior in firmware, expecting that
nothing was using it.
- a misinterpretation of what is going on.
> I'm not happy with dumbly upping the value and walking away - more work
> will be done to find out exactly what is causing this. Fortunately it
> is easy to reproduce.
4000 PCI cycles is a *very* long time. Ages. It should only take 0 to 2 PCI
cycles to queue a packet.
Donald Becker firstname.lastname@example.org
Scyld Computing Corporation
410 Severn Ave. Suite 210
Annapolis MD 21403