Re: 2.0.27 major problems #1 -- 3c59x driver.

Paul T Danset (pdanset@u.washington.edu)
Tue, 11 Feb 1997 15:35:40 -0800 (PST)


On Tue, 11 Feb 1997, Chris Evans wrote:

>
> This bug has been plaguing Linux long enough; if you have a test case
> where you can reproduce the repeated "access conflict" scenario, then by
> all means get into contact with the author who might be able to send some
> debugging code to help him sort it out.

Been there, done that. :)

> I was just about to knock up a patch like this.... it _still_ hangs??
> Could you elaborate more?

It remains operable even after many "Transmitter access conflict"s occur.
However, after running for several hours it will sometimes hang just like
before. Admittedly, I'm doing some pathological stress testing, e.g.:

hosta# ping -fs 65000 hostb
hostb# ping -fs 4000 hosta

The one sending smaller packets usually hangs.

> Note that the code claims an access conflict can only occur if, quote,
> "the queue layer is doing something evil". Is this likely to be the case?

I used to think that it was perhaps not a bug in the driver, but a problem
with other network layers. I've since changed my mind; here's why:

o The only people that seem to be having the "Transmitter access conflict"
  + network hang problem are those on busy networks with drivers that are
  very similar in structure to 3c59x.c. For example, I've never seen
  people complaining of similar problems with the smc-ultra.c driver ...
  even though it was also written in large part by the author of 3c59x.c.
  smc-ultra.c looks quite different from 3c59x.c, perhaps because Donald
  had to work around an interrupt-related bug by using busy-waiting. If
  the problem were in another part of the network layer, then we would be
  seeing errors from many other boards.

o I also don't think it is a hardware problem, since the same machine
  converted to NT 4.0 has never experienced a hang. (There seem to
  be more and more of these sprouting up around here, like pods in
  "Invasion of the Body Snatchers" :)
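
For anyone who hasn't looked at where the message comes from: an "access
conflict" report of this kind is what a test-and-set guard on the transmit
entry point prints when it is entered while the busy flag is already set.
Below is a minimal user-space sketch of such a guard. It is NOT the actual
3c59x.c code: it assumes C11 atomics in place of the kernel's bit operations,
and the names fake_dev, start_xmit and tx_done are made up for illustration.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for per-device state; not the kernel's struct. */
struct fake_dev {
    const char *name;
    atomic_flag tbusy;   /* set while a transmit is in progress */
    int conflicts;       /* how many times the guard has tripped */
};

/* Returns 0 if the packet was accepted, 1 if the transmitter was busy. */
static int start_xmit(struct fake_dev *dev)
{
    /* Atomically set the flag and inspect its previous value. */
    if (atomic_flag_test_and_set(&dev->tbusy)) {
        dev->conflicts++;
        fprintf(stderr, "%s: Transmitter access conflict.\n", dev->name);
        return 1;
    }
    /* ... a real driver would hand the packet to the hardware here ... */
    return 0;
}

/* Called when the (imaginary) hardware reports the transmit as finished. */
static void tx_done(struct fake_dev *dev)
{
    atomic_flag_clear(&dev->tbusy);
}
```

In this sketch, if tx_done() is never called (say, because a completion
interrupt was lost), every later start_xmit() call reports a conflict and
nothing is sent again, which from the outside would look like a hang.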

> Time I sent a ratty e-mail to 3-com for not writing a driver themselves
> :-)

I see your smiley, but just in case ... please don't! :) If it weren't for
Donald B., Paul G., Alan C. and many others, as well as support from the
commercial side like Cameron Spitzer of 3Com, Linux networking would
probably be done via smoke signals. :)

I would, however, appreciate any comment from 3Com about how timeouts are
handled gracefully in drivers for other OSes (e.g. NT) ...

-- Paul