Re: Notebooks

Andi Kleen (ak@muc.de)
Wed, 19 Aug 1998 08:26:20 +0200


On Wed, Aug 19, 1998 at 08:18:40AM +0200, Linus Torvalds wrote:
>
>
> On Wed, 19 Aug 1998, Andi Kleen wrote:
> >
> > My theory is that it is caused by better cache line usage. In the bulk
> > transfer case most packets have the same size (device MTU), and then
> > the cache wasn't effectively used. slab fixes that. Also the other
> > sk_buff code has been simplified which should speed it up too.
>
> Hmmm.. Slab does cache-align the allocations, something that the original
> code didn't try to do. The old kmalloc() tried to be space-efficient even
> if it meant returning an allocation at even just a 4-byte boundary, which
> might have been a mistake.
>
> I don't think the colouring should matter - at least I've never really
> seen any of that on my machines. But I've been playing with PPro's and
> PII's, mostly, and I think they are all four-way - if you did your numbers
> on a Pentium (no-MMX) it's only two-way set associative, and cache
> colouring may be more of an issue for you.

My benchmarks were run on a pre-MMX P90, yes.

The fast routing benchmarks ran on an AMD K6 133 and various Pentia (not
sure whether MMX or not).

I originally did cycle counter tests too. I can't find the exact numbers
anymore, but the result was that the new skb_alloc() alone was slower by a
few cycles than the old skb_alloc(); overall, though, with kfree_skb() and
some processing included, it was faster than the old code.

The additional memory use isn't too bad: about one page for every 25
skbuffs [sizeof(struct sk_buff) is currently 148 bytes, plus some slab
padding].

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.altern.org/andrebalsa/doc/lkml-faq.html