Re: [RFC][PATCH 0/5] preempt_count rework

From: Peter Zijlstra
Date: Wed Aug 14 2013 - 12:06:47 EST


On Wed, Aug 14, 2013 at 05:39:11PM +0200, Mike Galbraith wrote:
> On Wed, 2013-08-14 at 06:47 -0700, H. Peter Anvin wrote:
>
> > On x86, you never want to take the address of a percpu variable if you
> > can avoid it, as you end up generating code like:
> >
> > movq %fs:0,%rax
> > subl $1,(%rax)
>
> Hmmm..
>
> #define cpu_rq(cpu) (&per_cpu(runqueues, (cpu)))
> #define this_rq() (&__get_cpu_var(runqueues))
>
> ffffffff81438c7f: 48 c7 c3 80 11 01 00 mov $0x11180,%rbx
> /*
> * this_rq must be evaluated again because prev may have moved
> * CPUs since it called schedule(), thus the 'rq' on its stack
> * frame will be invalid.
> */
> finish_task_switch(this_rq(), prev);
> ffffffff81438c86: e8 25 b4 c0 ff callq ffffffff810440b0 <finish_task_switch>
> /*
> * The context switch has flipped the stack from under us
> * and restored the local variables which were saved when
> * this task called schedule() in the past. prev == current
> * is still correct, but it can be moved to another cpu/rq.
> */
> cpu = smp_processor_id();
> ffffffff81438c8b: 65 8b 04 25 b8 c5 00 mov %gs:0xc5b8,%eax
> ffffffff81438c92: 00
> rq = cpu_rq(cpu);
> ffffffff81438c93: 48 98 cltq
> ffffffff81438c95: 48 03 1c c5 00 f3 bb add -0x7e440d00(,%rax,8),%rbx
>
> ..so could the rq = cpu_rq(cpu) sequence be made cheaper, cycle-wise,
> by squirreling the rq pointer away in a percpu this_rq, and replacing
> cpu_rq(cpu) above with a __this_cpu_read(this_rq) version of this_rq()?

Well, this_rq() should already get you that. The above code sucks for
using cpu_rq() when we know cpu == smp_processor_id().
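
To make the difference concrete, here is a minimal userspace sketch of
the two lookups. It is not kernel code: every sim_* name is hypothetical,
sim_per_cpu_offset[] stands in for the offset table that the
"add -0x7e440d00(,%rax,8),%rbx" above indexes, and sim_this_cpu_off
stands in for the current CPU's percpu base that a %gs-relative access
would use. The assumption is that this_rq() resolves against that base
directly, while cpu_rq(cpu) first loads the CPU number and then indexes
the table.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_NR_CPUS 4

/* Hypothetical stand-in for struct rq. */
struct sim_rq {
    int cpu;
    unsigned long nr_running;
};

/* One runqueue per simulated CPU (stand-in for the percpu area). */
struct sim_rq sim_runqueues[SIM_NR_CPUS];

/* Stand-in for the per-CPU offset table indexed by cpu_rq(cpu). */
size_t sim_per_cpu_offset[SIM_NR_CPUS];

/* Stand-ins for state the real code would reach via the segment base. */
int sim_cpu_number;       /* what smp_processor_id() would return */
size_t sim_this_cpu_off;  /* offset of the current CPU's area */

void sim_setup(void)
{
    for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++) {
        sim_per_cpu_offset[cpu] = (size_t)cpu * sizeof(struct sim_rq);
        sim_runqueues[cpu].cpu = cpu;
        sim_runqueues[cpu].nr_running = 0;
    }
    sim_cpu_number = 0;
    sim_this_cpu_off = sim_per_cpu_offset[0];
}

/* Pretend the task was moved to another CPU across a context switch. */
void sim_migrate(int cpu)
{
    assert(cpu >= 0 && cpu < SIM_NR_CPUS);
    sim_cpu_number = cpu;
    sim_this_cpu_off = sim_per_cpu_offset[cpu];
}

int sim_smp_processor_id(void)
{
    return sim_cpu_number;
}

/* cpu_rq(cpu) style: needs the CPU number, then a table lookup. */
struct sim_rq *sim_cpu_rq(int cpu)
{
    return (struct sim_rq *)((char *)sim_runqueues + sim_per_cpu_offset[cpu]);
}

/* this_rq() style: one load of the current CPU's offset, no table. */
struct sim_rq *sim_this_rq(void)
{
    return (struct sim_rq *)((char *)sim_runqueues + sim_this_cpu_off);
}
```

Both helpers return the same pointer whenever cpu == sim_smp_processor_id();
the only difference in this model is that sim_cpu_rq() needs two dependent
loads (CPU number, then table entry) where sim_this_rq() needs one.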