Today Linus redesigns the networking driver interface (was Re: tulip driver in ...)

Alan Cox (alan@lxorguk.ukuu.org.uk)
Sat, 19 Sep 1998 20:39:57 +0100 (BST)


> - grep for it
> 99% of all uses are non-atomic. That's obviously wrong in itself,
> as most of those are outside any global interrupt locks, and the
> flag is tested and set by interrupts, not by bottom half handlers.

It is always set by an interrupt or bh atomic code. It is always tested
by bh atomic code. The case where a test returns a stale tbusy=0 is
also safe.
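
To illustrate (a simplified sketch of the consumer side, not the literal
net_bh code; hard_start_xmit is the driver's transmit entry point): the
worst a stale tbusy=0 read can do is send us into the driver once too
often, and the driver itself refuses the packet.

	/*
	 * Sketch of the net_bh side. A stale tbusy == 0 costs at
	 * most one wasted call into the driver.
	 */
	if (dev->tbusy == 0) {
		if (dev->hard_start_xmit(skb, dev) != 0) {
			/*
			 * The driver re-checked its real ring state,
			 * found it full, set tbusy itself and refused
			 * the packet; it stays queued and is retried
			 * on a later NET_BH run.
			 */
		}
	}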

So it's right, but for all the wrong reasons.

> Ergo, every single use of tbusy seems to be buggy. Better just get
> rid of it.

And later on you propose replacing it with itself, only broken. Renaming
it to avoid accidents makes sense. OK

> It's testing tbusy without setting it, and without having disabled
> the events that can change it. As such, it is neither UP nor SMP

That is safe

> safe - it happens to work apparently because all drivers when they
> change their state of tbusy will also force a NET_BH to be run,
> even if they don't actually have any incoming packets.

That is -explicitly- required by the use of tbusy, and it was always
designed like that. You have to schedule a bh on a tbusy clear, otherwise
nothing in the kernel will feed you packets. You can't change that. Your
design below requires fixing to do this. Someone has to schedule the
delivery of further packets. Therefore you must use mark_bh().
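
In other words, the clear path in a driver has to look roughly like this
(a sketch of the convention, not code from any particular driver):

	/*
	 * In the TX-complete interrupt handler, once descriptors
	 * have been reclaimed:
	 */
	dev->tbusy = 0;
	mark_bh(NET_BH);	/* make net_bh run and feed us more packets */

Without the mark_bh() the device queue just sits there until something
else happens to trigger NET_BH.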

> Or maybe it works due to the window being fairly small, and us
> being lucky - packet re-send just restarts the transmission even
> if it died for a while.

It works fine. The broken case is the fact that tbusy isn't enforced on the
first send. To fix that requires putting in a dev->timeout handler for
each device. If you break that behaviour then the moment your queue is full
and you take a tx timeout you are dead.
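
Purely as a sketch of the idea (there is no dev->timeout field today, and
reset_tx_ring() stands in for whatever driver-specific recovery is
needed):

	/*
	 * Hypothetical per-device timeout handler, run when no TX
	 * completion has arrived for too long.
	 */
	static void eth_tx_timeout(struct device *dev)
	{
		reset_tx_ring(dev);	/* driver-specific: recover the hardware */
		dev->tbusy = 0;		/* we can accept packets again */
		mark_bh(NET_BH);	/* and ask net_bh for them */
	}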

> instances of it being used that we want to save. Getting rid of it
> will at least allow the compiler to find all cases where somebody
> _tries_ to use it.

Ok

> if (!test_and_set_bit(DEV_TRANSMIT, &dev->flags)) {

test_and_set_bit has fairly poor performance on some non-Intel
architectures because it is implemented with spin locks. We want something
that can use things like the sparc exchange-with-0xFF (ldstub) locking.

(Also, dev->flags is already fully reserved.) dev->flow, perhaps?
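
For illustration, and assuming an xchg() that takes byte operands on the
architectures in question, a one-byte flag maps straight onto the sparc
ldstub instruction:

	/*
	 * Sketch: a one-byte transmit flag set with a plain atomic
	 * exchange (ldstub on sparc) rather than the hashed spin
	 * locks that back test_and_set_bit there. dev->flow here is
	 * the suggested new field, not anything that exists yet.
	 */
	if (xchg(&dev->flow, 0xff) == 0) {
		/* flag was clear; we own the transmit path */
	}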

> and then make the cases where the network drivers marked tbusy = 0
> (the correct ones - the ones that tell the system that its write
> queues are empty - this is about 10% of the tbusy use as far as I
> can tell) do a
>
> if (!test_and_clear_bit(DEV_TRANSMIT, &dev->flags)) {
> panic("buggy driver, cleared DEV_TRANSMIT twice");
> }
>
> when their transmit buffer empties.

That adds potential races and makes life extremely hard. It also doesn't
work as written: you need to mark_bh(NET_BH) so net_bh feeds the device
more packets.

Clearing the flag when it is already clear is perfectly reasonable
behaviour; there are several cases where only looking at the flag itself
will tell you whether the driver is in a state where it needs clearing.
Spending all our time generating

if (x) {
	if (!x)
		panic();
}

seems silly. Clearing it twice is, at absolute worst, a minor performance
nit, so it should not be a panic() anyway.
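
The robust version of the clear side is then just an unconditional store
plus the bh kick (using the byte flag sketched above):

	dev->flow = 0;		/* harmless even if it was already clear */
	mark_bh(NET_BH);	/* always reschedule net_bh to feed us packets */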

> (Oh, "Chapter 3" is very short, and only has a heading and no actual text.
> It says "Remove dev->interrupt, because nobody is using it").

Nod.

Add Chapter 4. Fix every use of timeouts in each ethernet and non-ethernet
driver.

Chapter 5. Recruit Staff ;)

Alan
