Re: alpha cmpxchg.h (was Re: [PATCH v2 cmpxchg 12/13] sh: Emulate one-byte cmpxchg)

From: Linus Torvalds
Date: Thu May 02 2024 - 18:17:24 EST


On Thu, 2 May 2024 at 14:01, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> +static inline unsigned long
> +____xchg_u8(volatile char *m, unsigned long val)
> +{
> +	unsigned long ret, tmp, addr64;
> +
> +	__asm__ __volatile__(
> +	"	andnot	%4,7,%3\n"
> +	"	insbl	%1,%4,%1\n"
> +	"1:	ldq_l	%2,0(%3)\n"
> +	"	extbl	%2,%4,%0\n"
> +	"	mskbl	%2,%4,%2\n"
> +	"	or	%1,%2,%2\n"
> +	"	stq_c	%2,0(%3)\n"
> +	"	beq	%2,2f\n"
> +	".subsection 2\n"
> +	"2:	br	1b\n"
> +	".previous"
> +	: "=&r" (ret), "=&r" (val), "=&r" (tmp), "=&r" (addr64)
> +	: "r" ((long)m), "1" (val) : "memory");
> +
> +	return ret;
> +}

Side note: if you move this around, I think you should just uninline
it too and turn it into a function call.
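
IOW, keep just the declaration in the header and move the body out of
line, something like this (the file placement is purely illustrative,
I haven't thought about where it should actually live):

/* asm/cmpxchg.h - just the declaration */
extern unsigned long ____xchg_u8(volatile char *m, unsigned long val);

/* some arch/alpha C file - the asm body moves here unchanged */
unsigned long ____xchg_u8(volatile char *m, unsigned long val)
{
	unsigned long ret, tmp, addr64;

	__asm__ __volatile__(
	"	andnot	%4,7,%3\n"
	"	insbl	%1,%4,%1\n"
	"1:	ldq_l	%2,0(%3)\n"
	"	extbl	%2,%4,%0\n"
	"	mskbl	%2,%4,%2\n"
	"	or	%1,%2,%2\n"
	"	stq_c	%2,0(%3)\n"
	"	beq	%2,2f\n"
	".subsection 2\n"
	"2:	br	1b\n"
	".previous"
	: "=&r" (ret), "=&r" (val), "=&r" (tmp), "=&r" (addr64)
	: "r" ((long)m), "1" (val) : "memory");

	return ret;
}

(plus an EXPORT_SYMBOL() if modules can end up calling it).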

This inline asm doesn't actually take any advantage of being inlined.
The main reason to inline something like this is that you could then
take advantage of compile-time-known alignment instead of using the
generic software sequence. But this asm doesn't do that: it computes
the byte lane at runtime every time, so it always uses the worst-case
code sequence for inserting the byte.
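
For example, when the low bits of the pointer are known at compile
time, the byte lane is a constant, and even a plain C sketch like this
(not meant as real kernel code - a real version would want the ll/sc
loop directly instead of a compare-and-swap loop, and the name is made
up) lets the compiler fold the shift and the mask into literals:

static inline unsigned long
__xchg_u8_sketch(volatile char *m, unsigned long val)
{
	/* the aligned quadword containing the byte */
	volatile unsigned long *q =
		(volatile unsigned long *)((unsigned long)m & ~7UL);
	/* these fold to constants when the compiler sees m's alignment */
	unsigned long shift = ((unsigned long)m & 7) * 8;
	unsigned long mask = 0xffUL << shift;
	unsigned long old, new;

	do {
		old = *q;
		new = (old & ~mask) | ((val & 0xff) << shift);
	} while (!__sync_bool_compare_and_swap(q, old, new));

	return (old >> shift) & 0xff;
}

That's the kind of thing inlining could buy you, and the asm above can
never do it, because the lane computation is baked into the
instruction stream.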

Put that together with "byte and word xchg are rare", and it really
smells to me like we shouldn't be inlining this.

Now, the 32-bit and 64-bit cases are different - more common, but also
much simpler code sequences. They seem worth inlining.
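
From memory, the 64-bit one is just the bare ll/sc loop, something
like this (modulo the exact operand constraints, so don't quote me on
the details):

static inline unsigned long
____xchg_u64(volatile long *m, unsigned long val)
{
	unsigned long dummy;

	__asm__ __volatile__(
	"1:	ldq_l	%0,%4\n"
	"	bis	$31,%3,%1\n"
	"	stq_c	%1,%2\n"
	"	beq	%1,2f\n"
	".subsection 2\n"
	"2:	br	1b\n"
	".previous"
	: "=&r" (val), "=&r" (dummy), "=m" (*m)
	: "rI" (val), "m" (*m) : "memory");

	return val;
}

Four instructions in the hot path, and no address games at all.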

That said, maybe for alpha "just move the code around" beats "fix up
old bad decisions" simply because the effort is lower.

Linus