Re: [patch 04/15] Generic Mutex Subsystem,add-atomic-call-func-x86_64.patch

From: Nicolas Pitre
Date: Tue Dec 20 2005 - 14:48:31 EST


On Tue, 20 Dec 2005, Linus Torvalds wrote:

>
>
> On Tue, 20 Dec 2005, Nicolas Pitre wrote:
> >
> > Sure, and we're now more costly than the current implementation with irq
> > disabling.
>
> Do the timing. It may be more instructions, but I think it was you
> yourself that timed the current thing at 23 cycles, no?

Yes. And with my hand-optimized assembly version using
(the equivalent of) preempt_disable/preempt_enable I get 20 cycles over
14 instructions. If I let gcc do it using the standard macros it gets
worse.

The irq disabling implementation spends 23 cycles over 8 instructions.

The mutex swp-based implementation spends 8 cycles over 4 instructions.

If we make the preempt_disable/enable implementation out of line that'll
reduce the .text size bloat, but it'LL add more cycles due to the
additional two branches plus preserve of the return address making it
above 23 cycles for sure.

> Special registers are almost always slower than nicely cached accesses
> (and they all basically will be).

Sure, but do the numbers above tell you anything else? It seems pretty
obvious to me that simple mutexes are a real gain. Suffice for the
implementation to let architecture do their best to lock the mutex and
fall back to generic code when there is contention (just like we already
do for semaphores).


Nicolas
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/