Re: [RFC][PATCH 0/5] preempt_count rework

From: H. Peter Anvin
Date: Wed Aug 14 2013 - 09:48:52 EST

On 08/14/2013 06:15 AM, Peter Zijlstra wrote:
> These patches optimize preempt_enable by firstly folding the preempt and
> need_resched tests into one -- this should work for all architectures. And
> secondly by providing per-arch preempt_count implementations; with x86 using
> per-cpu preempt_count for fastest access.
> These patches have so far only been compiled for defconfig-x86_64 +
> CONFIG_PREEMPT=y and boot tested with kvm -smp 4 up to wanting to mount root.
> It still needs asm volatile("call preempt_schedule": : :"memory"); as per
> Andi's other patches to avoid the C calling convention cluttering the
> preempt_enable() sites.


I still don't see this using a decrement of the percpu variable
anywhere. The C compiler doesn't know how to generate those, so if I'm
not completely wet we will end up relying on sub_preempt_count()...
which, because it relies on taking the address of the percpu variable,
will generate absolutely horrific code.

On x86, you never want to take the address of a percpu variable if you
can avoid it, as you end up generating code like:

	movq %fs:0,%rax
	subl $1,(%rax)

... for absolutely no good reason. You can use the existing accessors
for percpu variables, but that would make you lose the flags output
(the condition codes set by the decrement), which was part of the
point. So I think the whole sequence needs to be in assembly (note that
once you are manipulating percpu state you are already in assembly).
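
To make the flags point concrete, here is a minimal userspace sketch, not
kernel code, of a decrement whose zero-flag result is consumed directly.
The names demo_preempt_count and demo_dec_and_test() are hypothetical, and
a plain static int stands in for the percpu variable. In the kernel the
memory operand would carry the percpu segment prefix, so no address ever
needs to be loaded into a register first.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the percpu preempt count (assumption: a plain
 * static int, so that the sketch can run in userspace). */
static int demo_preempt_count = 1;

/* Decrement the counter with one read-modify-write instruction and return
 * true if the result is zero, taking the answer from ZF rather than
 * re-reading the variable. */
static inline bool demo_dec_and_test(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned char zero;

	/* decl works on the memory operand in place; sete copies ZF out. */
	__asm__ volatile("decl %0\n\t"
			 "sete %1"
			 : "+m" (demo_preempt_count), "=qm" (zero)
			 :
			 : "cc");
	return zero;
#else
	/* Portable fallback so the sketch still builds on other targets. */
	return --demo_preempt_count == 0;
#endif
}
```

With the counter starting at 1, the first call to demo_dec_and_test()
returns true and leaves it at 0. Whether a real preempt_enable() should
look like this is a separate question; the sketch only shows that the
decrement and the test can share one instruction's flags.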


