Re: Today Linus redesigns the networking driver interface (was Re: tulip driver in ...)

Linus Torvalds (torvalds@transmeta.com)
Sat, 19 Sep 1998 23:11:49 -0700 (PDT)


On Sat, 19 Sep 1998, Linus Torvalds wrote:
>
> The reason we don't get timer interrupts is not spl-levels. I bet it's
> just a fairly simple issue of CPU starvation. Fixing it may not be simple:
> fairness never is. But let's not panic.

Looking at net_bh(), it looks like that's the first suspect.

The reason the machine dies when you feed it incoming packets faster than
it can handle is very simple: net_bh() essentially becomes an endless
loop. What do you expect if you loop forever inside a software interrupt
handler?

The net_bh() code tries to be nice by looking at "jiffies", but by the
time you've spent more than a jiffy on endless incoming packets you've
spent way too much time and it's really much much too late.

There's code to do something more akin to the right thing with
"CONFIG_CPU_IS_SLOW". But even that seems to try to be overly clever. It
should be fairly easy to simply limit the maximum number of packets
processed in one net_bh() invocation (select a nice random number like 64
packets max per invocation and try different values to make sure you don't
start dropping until you really need to). And then you just drop the slop.
Unconditionally. Long before you've spent a jiffy on it.

So the current loop looks something like this:

	while (!skb_queue_empty(&backlog)) {
		skb = skb_dequeue(&backlog);
		.. look at jiffies etc ..
	}

and it should probably just be something like this instead:

	int max;

	/* get the list off the backlog, empty the backlog */
	spin_lock_irq(&skb_queue_lock);
	skb = NULL;
	/* an empty backlog points back at itself, so only detach a non-empty one */
	if (!skb_queue_empty(&backlog)) {
		skb = backlog.next;
		backlog.prev->next = NULL;
		backlog.next = &backlog;
		backlog.prev = &backlog;
		backlog.qlen = 0;
	}
	spin_unlock_irq(&skb_queue_lock);

	/* Go through the list, max X packets */
	max = 64;
	while (skb && max--) {
		struct sk_buff *now = skb;
		skb = skb->next;
		now->next = NULL;
		now->prev = NULL;
		now->list = NULL;

		handle_one_skb(now);
	}

	/* Free any remaining slop that we didn't have time to take care of */
	while (skb) {
		struct sk_buff *now = skb;
		skb = skb->next;
		kfree_skb(now);
	}

which, btw, only gets the skb spinlock once instead of once per packet etc
etc.
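
For anybody who wants to play with the idea outside the kernel, here is a
minimal user-space sketch of the same pattern: detach the whole list in one
step, handle at most a fixed number of entries, and drop the rest. Everything
in it is hypothetical and only stands in for the real thing: "struct pkt"
and "struct pkt_queue" replace sk_buff and the backlog, free() replaces both
handle_one_skb() and kfree_skb(), and there is no locking at all, so treat it
as an illustration of the control flow rather than as kernel code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff: a singly linked node. */
struct pkt {
	struct pkt *next;
	int id;
};

/* Hypothetical stand-in for the backlog queue. */
struct pkt_queue {
	struct pkt *head;
	struct pkt *tail;
	unsigned int qlen;
};

/* How many packets one invocation handled and how many it dropped. */
struct batch_stats {
	unsigned int handled;
	unsigned int dropped;
};

/* Append one packet to the tail of the queue. */
static void queue_push(struct pkt_queue *q, int id)
{
	struct pkt *p = malloc(sizeof(*p));

	if (!p)
		return;
	p->next = NULL;
	p->id = id;
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->qlen++;
}

/*
 * Take the whole list off the queue in one step. In the kernel this is
 * the only part that would need the queue lock.
 */
static struct pkt *queue_detach_all(struct pkt_queue *q)
{
	struct pkt *list = q->head;

	q->head = NULL;
	q->tail = NULL;
	q->qlen = 0;
	return list;
}

/* Handle at most "max" packets, then unconditionally drop the rest. */
static struct batch_stats process_backlog(struct pkt_queue *q, unsigned int max)
{
	struct batch_stats st = { 0, 0 };
	struct pkt *p = queue_detach_all(q);

	while (p && st.handled < max) {
		struct pkt *now = p;

		p = p->next;
		free(now);	/* stand-in for handling the packet */
		st.handled++;
	}

	while (p) {
		struct pkt *now = p;

		p = p->next;
		free(now);	/* stand-in for dropping the packet */
		st.dropped++;
	}
	return st;
}
```

Queueing 100 packets with queue_push() and then calling
process_backlog(&q, 64) handles 64 of them and drops the other 36, and the
work done per invocation is bounded by the queue depth at entry, not by how
fast new packets keep arriving.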

The above should essentially guarantee that we spend as little time as
possible on handling too deep queues. If that's not enough, then we need
to start to flow control the interrupts themselves.

Has anybody tried the simple and obvious approach like the above?

Linus
