Re: tulip driver in 2.1.11* - 2.1.21 is broken - new driver

Linus Torvalds (torvalds@transmeta.com)
Sat, 19 Sep 1998 11:28:06 -0700 (PDT)


On Sat, 19 Sep 1998, Alan Cox wrote:

> > I think both dev->interrupt and dev->tbusy should just go away. Both of
> > them are completely broken, for several reasons:
> > - they use up a long-word for _one_ bit of information. That's 63 wasted
> > bits on some architectures.
>
> They are performance important. And afaik you btw made dev->interrupt a long
> word to cover for the fact the atomic stuff doesnt work on char ;)

Correct. But I did that without grepping for the use of it. Now that I
have, it's very obvious that I shouldn't have made it a long, I should
just have gotten rid of it completely. I don't find anything actually
using it, except for a few drivers who caused the problem in the first
place.

As for tbusy, I know it's supposed to be used for flow control. Grepping
for it in the driver directory showed very few cases of that, and almost
every use seemed to be about using it for re-entrancy.

More importantly, grepping for it shows that EVERY SINGLE USE OF IT IS
WRONG!

As such, it would be a lot better to just get rid of it completely, and
instead introduce a "dev->flags" field, and make sure that we _only_ use
that for what it's supposed to be used for.

Tell me if I'm wrong on any of the following. I may be, but I can't find
why:

Chapter 1
"Why tbusy is wrong"
by
Linus Torvalds

- grep for it
99% of all uses are non-atomic. That's obviously wrong in itself,
as most of those are outside any global interrupt locks, and the
flag is tested and set by interrupts, not by bottom half handlers.

The ones that ARE atomic are wrong too, because those seem to all
be used for re-entrancy testing rather than for flow control.

Ergo, every single use of tbusy seems to be buggy. Better just get
rid of it.

- qdisc_wakeup()
This is the main routine that is supposed to do the re-entrancy
test. To some degree this is the "heart" of tbusy. And even this
simple routine gets it utterly wrong.

It's testing tbusy without setting it, and without having disabled
the events that can change it. As such, it is neither UP nor SMP
safe - it happens to work apparently because all drivers when they
change their state of tbusy will also force a NET_BH to be run,
even if they don't actually have any incoming packets.

Or maybe it works due to the window being fairly small, and us
being lucky - packet re-send just restarts the transmission even
if it died for a while.

Chapter 2
"How to fix tbusy without going crazy"

- get rid of it
There are no correct users of it right now anyway, so there are no
instances of it being used that we want to save. Getting rid of it
will at least allow the compiler to find all cases where somebody
_tries_ to use it.

- instantiate a new "dev->flags"
Make qdisk_wakeup() do a

if (!test_and_set_bit(DEV_TRANSMIT, &dev->flags)) {
struct Qdisc *q = dev->qdisc;
if (qdisc_restart(dev) && q->h.forw == NULL) {
q->h.forw = qdisc_head.forw;
qdisc_head.forw = &q->h;
}
}

and then make the cases where the network drivers marked tbusy = 0
(the correct ones - the ones that tell the system that its write
queues are empty - this is about 10% of the tbusy use as far as I
can tell) do a

if (!test_and_clear_bit(DEV_TRANSMIT, &dev->flags)) {
panic("buggy driver, cleared DEV_TRANSMIT twice");
}

when their transmit buffer empties.

The above is what tbusy _should_ be doing already, but implemented right.

Am I wrong?

(Oh, "Chapter 3" is very short, and only has a heading and no actual text.
It says "Remove dev->interrupt, because nobody is using it").

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/