Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

From: Ingo Molnar
Date: Mon Nov 17 2008 - 12:09:31 EST



* Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:

> Ingo Molnar wrote:
>> * Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:
>>
>>>> It all looks like pure old-fashioned straight overhead in the
>>>> networking layer to me. Do we still touch the same global cacheline
>>>> for every localhost packet we process? Anything like that would
>>>> show up big time.
>>> Yes we do. I find it strange that we don't see dst_release() in your
>>> NMI profile.
>>>
>>> I posted a patch ( commit 5635c10d976716ef47ae441998aeae144c7e7387
>>> net: make sure struct dst_entry refcount is aligned on 64 bytes) (in
>>> net-next-2.6 tree) to properly align struct dst_entry refcounter and
>>> got 4% speedup on tbench on my machine.
>>
>> Ouch, +4% from a oneliner networking change? That's a _huge_ speedup
>> compared to the things we were after in scheduler land. A lot of
>> scheduler folks worked hard to squeeze the last 1-2% out of the
>> scheduler fastpath (which was not trivial at all). The _full_
>> scheduler accounts for only about 7% of the total system overhead here
>> on a 16-way box...
>
> 4% on my machine, but apparently my machine is sooooo special (see
> oprofile thread), so maybe its cpus have a hard time playing with a
> contended cache line.
>
> It definitely needs more testing on other machines.
>
> Maybe you'll discover the patch is bad on your machines, this is why
> it's in net-next-2.6

ok, i'll try it on my testbox too, to check whether it has any effect
- find below the port to -git.

tbench _is_ very sensitive to seemingly small details - it seems to be
hovering around some sort of CPU cache boundary and penalizing
random alignment changes, as we drop in and out of the sweet spot.
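
To illustrate the kind of alignment effect being discussed, here is a
minimal userspace sketch. It is not the patch and not the real layout
of struct dst_entry: the struct names, the field names, the field
sizes and the 64-byte line size are all assumptions made up for the
example. It only shows how forcing a frequently written refcount onto
its own 64-byte boundary moves it off the cache line holding the
read-mostly fields.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumed cache line size; real hardware may differ. */
#define CACHE_LINE 64

/*
 * Hypothetical stand-in for a routing cache entry (NOT struct dst_entry).
 * The refcount lands at offset 60, so it shares cache line 0 with the
 * read-mostly fields: every hold/release dirties that line.
 */
struct demo_entry_packed {
	char read_mostly[60];	/* fields read for every packet */
	int refcnt;		/* written for every packet */
	long lastuse;
};

/*
 * Same fields, but the refcount is forced to start on a 64-byte
 * boundary, so writes to it no longer touch the read-mostly line.
 */
struct demo_entry_aligned {
	char read_mostly[60];
	alignas(CACHE_LINE) int refcnt;
	long lastuse;
};

/* Index of the cache line that a given byte offset falls into. */
static size_t cache_line_of(size_t offset)
{
	return offset / CACHE_LINE;
}

/* True if two byte offsets fall into the same cache line. */
static int same_cache_line(size_t a, size_t b)
{
	return cache_line_of(a) == cache_line_of(b);
}

/* Compile-time check that the aligned layout is what we intended. */
static_assert(offsetof(struct demo_entry_aligned, refcnt) % CACHE_LINE == 0,
	      "refcnt must start on a cache line boundary");
```

With these made-up field sizes, same_cache_line(0, offsetof(struct
demo_entry_packed, refcnt)) is true, while the same check against
struct demo_entry_aligned is false. The price is padding: the aligned
variant grows to two full cache lines, which is one reason such a
change can help on one machine and hurt on another.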

Mike Galbraith has been spending months trying to pin down all the
issues.

Ingo

------------->