Re: skb_release_head_state(): Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

From: Ingo Molnar
Date: Mon Nov 17 2008 - 16:38:35 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Mon, 17 Nov 2008, Ingo Molnar wrote:
> >
> > this function _really_ hurts from a 16-bit op:
> >
> > ffffffff8048943e:     6503   66 c7 83 a8 00 00 00    movw   $0x0,0xa8(%rbx)
> > ffffffff80489445:        0   00 00
> > ffffffff80489447:   174101   5b                      pop    %rbx
>
> I don't think that is it, actually. The 16-bit store just before it
> had a zero count, even though anything that executes the second one
> will always execute the first one too.

yeah - look at the followup bits that identify the likely real source
of that overhead:

>> _But_, the real overhead probably comes from:
>>
>> ffffffff804b7210: 10867 48 8b 54 24 58 mov 0x58(%rsp),%rdx
>>
>> which is the next line, the ttl field:
>>
>> 373 iph->ttl = ip_select_ttl(inet, &rt->u.dst);
>>
>> this shows that we are doing a hard cachemiss on the net-localhost
>> route dst structure cacheline. We do a plain load instruction from
>> it here and get a hefty cachemiss. (because 16 CPUs are banging on
>> that single route)
>>
>> And let's make sure we see this in perspective as well: that single
>> cachemiss is _1.0 percent_ of the total tbench cost. (!) We could
>> make the scheduler 10% slower straight away and it would have less
>> of a real-life effect than this single iph->ttl field setting.
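
To illustrate the kind of cacheline contention described above, here is a
minimal userspace sketch. It assumes (the thread does not confirm this) that
the miss comes from a frequently written field sharing a cache line with the
read-mostly field being loaded. The struct names, field names and the 64-byte
line size are hypothetical and are not the kernel's actual dst layout.

```c
#include <stddef.h>

#define CACHELINE_BYTES 64	/* assumption: 64-byte cache lines */

/*
 * Hypothetical layout: a counter written by every CPU sits on the same
 * cache line as a field that is only read on the fast path. Each write
 * invalidates the line on the other CPUs, so the plain load misses.
 */
struct route_shared {
	int refcnt;	/* hot, written */
	int ttl;	/* read-mostly */
};

/* Same fields, with the written counter pushed onto its own line. */
struct route_split {
	int ttl;				/* read-mostly */
	_Alignas(CACHELINE_BYTES) int refcnt;	/* hot, written */
};

/* Index of the cache line that a byte offset within a struct falls into. */
static inline size_t cacheline_of(size_t offset)
{
	return offset / CACHELINE_BYTES;
}

/* Compile-time checks of the two layouts. */
_Static_assert(offsetof(struct route_shared, ttl) / CACHELINE_BYTES ==
	       offsetof(struct route_shared, refcnt) / CACHELINE_BYTES,
	       "shared layout: both fields on one line");
_Static_assert(offsetof(struct route_split, ttl) / CACHELINE_BYTES !=
	       offsetof(struct route_split, refcnt) / CACHELINE_BYTES,
	       "split layout: fields on different lines");
```

In the split layout, writers of refcnt no longer invalidate the line that
holds ttl, so readers of ttl can keep that line in their local caches. The
cost is a larger structure.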