Re: [PATCH] x86: slightly shorten __ticket_spin_trylock() (v2)

From: Linus Torvalds
Date: Thu Dec 03 2009 - 16:12:21 EST

On Thu, 3 Dec 2009, Jan Beulich wrote:
>
> -static __always_inline int __ticket_spin_trylock(raw_spinlock_t *lock)
> +static __always_inline u8 __ticket_spin_trylock(raw_spinlock_t *lock)
> {
> int tmp, new;
>
> @@ -88,12 +88,11 @@ static __always_inline int __ticket_spin
> LOCK_PREFIX "cmpxchgw %w1,%2\n\t"
> "1:"
> "sete %b1\n\t"
> - "movzbl %b1,%0\n\t"
> : "=&a" (tmp), "=&q" (new), "+m" (lock->slock)
> :
> : "memory", "cc");
>
> - return tmp;
> + return new;

This is fairly pessimal register allocation.

It used to be that we returned the value in 'tmp', which is %eax, i.e. also
the expected return register. Now that we use 'new', the result ends up in
some random other register that is _not_ %eax, which means that while we
avoid the 'movzbl', the regular out-of-line spin_trylock function call case
will now have the compiler emitting a

movb %X,%al
ret

at the end just to get the right return register.

Which seems a bit annoying. It looks like we could have made that last
instruction just

sete %b0

instead, and then returned 'tmp' instead of 'new', keeping the return value
in %al and avoiding the unnecessary register move.
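
IOW, keep the 'u8' return type from your patch, but leave the result in %al.
Just sketching it against your quoted hunk (not compiled or tested), the tail
would become something like:

		LOCK_PREFIX "cmpxchgw %w1,%2\n\t"
		"1:"
		"sete %b0\n\t"
		: "=&a" (tmp), "=&q" (new), "+m" (lock->slock)
		:
		: "memory", "cc");

	return tmp;	/* only the low byte, i.e. %al, matters for u8 */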

> -static __always_inline int __ticket_spin_trylock(raw_spinlock_t *lock)
> +static __always_inline u8 __ticket_spin_trylock(raw_spinlock_t *lock)

Same thing here, afaik.

NOTE! I haven't actually looked at the generated code, and if we actually
inline it all the way to the caller, it won't matter (and picking another
register may even help). But while these helpers are marked __always_inline,
I _thought_ that the way we actually build the final 'spin_trylock()'
function, we end up with a real out-of-line function in the end.
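
In that case the tail of the out-of-line function would differ roughly like
this (hand-written sketch, not actual compiler output; %dl just stands in for
whichever register the compiler happens to pick for 'new'):

	# patch as posted: result lands in 'new'
	sete	%dl
	movb	%dl,%al
	ret

	# with 'sete %b0' and 'return tmp': result is already in %al
	sete	%al
	ret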

Maybe I'm wrong.

Linus