Re: [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64

From: Ingo Molnar
Date: Tue Apr 09 2024 - 03:51:03 EST



* Uros Bizjak <ubizjak@xxxxxxxxx> wrote:

> Commit:
>
> 6d12c8d308e68 ("percpu: Wire up cmpxchg128")
>
> improved emulated cmpxchg8b_emu() library function to return
> success/failure in a ZF flag.
>
> Define arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 targets
> to override the generic archy_try_cmpxchg() with an optimized
> target specific implementation that handles ZF flag.
>
> The assembly code at the call sites improves from:
>
> bf56d: e8 fc ff ff ff call cmpxchg8b_emu
> bf572: 8b 74 24 28 mov 0x28(%esp),%esi
> bf576: 89 c3 mov %eax,%ebx
> bf578: 89 d1 mov %edx,%ecx
> bf57a: 8b 7c 24 2c mov 0x2c(%esp),%edi
> bf57e: 89 f0 mov %esi,%eax
> bf580: 89 fa mov %edi,%edx
> bf582: 31 d8 xor %ebx,%eax
> bf584: 31 ca xor %ecx,%edx
> bf586: 09 d0 or %edx,%eax
> bf588: 0f 84 e3 01 00 00 je bf771 <...>
>
> to:
>
> bf572: e8 fc ff ff ff call cmpxchg8b_emu
> bf577: 0f 84 b6 01 00 00 je bf733 <...>
>
> No functional changes intended.

Side note: while there's no hard-written rule for it, I tend to use the 'no
functional changes intended' line for pure identity transformations - which
this one isn't, as it changes code generation materially.

So I removed that line - the explanation of the patch is clear enough IMO.

Thanks,

Ingo