Re: locking/atomic: Introduce atomic_try_cmpxchg()

From: Dmitry Vyukov
Date: Mon Mar 27 2017 - 09:46:53 EST


On Mon, Mar 27, 2017 at 2:16 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Fri, Mar 24, 2017 at 01:44:00PM +0100, Dmitry Vyukov wrote:
>> Hi,
>>
>> I've come across:
>>
>> commit a9ebf306f52c756c4f9e50ee9a60cd6389d71344
>> Author: Peter Zijlstra
>> Date: Wed Feb 1 16:39:38 2017 +0100
>> locking/atomic: Introduce atomic_try_cmpxchg()
>>
>> The primitive has subtle difference with all other implementation that
>> I know of, and can lead to very subtle bugs. Some time ago I've spent
>> several days debugging a memory corruption caused by similar
>> implementation.
>
> OK, so how about this?
>
> ---
> Subject: atomic: Fix try_cmpxchg semantics
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Date: Mon Mar 27 13:54:38 CEST 2017
>
> Dmitry noted that the new try_cmpxchg() primitive is broken when the
> old pointer doesn't point to local stack.
>
> He writes: "Consider a classical lock-free stack push:
>
> node->next = atomic_read(&head);
> do {
> } while (!atomic_try_cmpxchg(&head, &node->next, node));
>
> This code is broken with the current implementation, the problem is
> with unconditional update of *__po.
>
> In case of success it writes the same value back into *__po, but in
> case of cmpxchg success we might have lose ownership of some memory
> locations and potentially over what __po has pointed to. The same
> holds for the re-read of *__po. "
>
> He also points out that this makes it surprisingly different from the
> similar C/C++ atomic operation.
>
> After investigating the code-gen differences caused by this patch; and
> a number of alternatives (Linus dislikes this interface lots), we
> arrived at these results (size x86_64-defconfig/vmlinux):
>
> GCC-6.3.0:
>
> 10735757 cmpxchg
> 10726413 try_cmpxchg
> 10730509 try_cmpxchg + patch
> 10730445 try_cmpxchg-linus
>
> GCC-7 (20170327):
>
> 10709514 cmpxchg
> 10704266 try_cmpxchg
> 10704266 try_cmpxchg + patch
> 10704394 try_cmpxchg-linus
>
> From this we see that the patch has the advantage of better code-gen on
> GCC-7 and keeps the interface roughly consistent with the C language
> variant.
>
> Fixes: a9ebf306f52c ("locking/atomic: Introduce atomic_try_cmpxchg()")
> Reported-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
> arch/x86/include/asm/cmpxchg.h | 5 +++--
> include/linux/atomic.h | 16 ++++++++++------
> 2 files changed, 13 insertions(+), 8 deletions(-)
>
> --- a/arch/x86/include/asm/cmpxchg.h
> +++ b/arch/x86/include/asm/cmpxchg.h
> @@ -212,8 +212,9 @@ extern void __add_wrong_size(void)
> default: \
> __cmpxchg_wrong_size(); \
> } \
> - *_old = __old; \
> - success; \
> + if (unlikely(!success)) \
> + *_old = __old; \
> + likely(success); \
> })
>
> #define __try_cmpxchg(ptr, pold, new, size) \
> --- a/include/linux/atomic.h
> +++ b/include/linux/atomic.h
> @@ -428,9 +428,11 @@
> #define __atomic_try_cmpxchg(type, _p, _po, _n) \
> ({ \
> typeof(_po) __po = (_po); \
> - typeof(*(_po)) __o = *__po; \
> - *__po = atomic_cmpxchg##type((_p), __o, (_n)); \
> - (*__po == __o); \
> + typeof(*(_po)) __r, __o = *__po; \
> + __r = atomic_cmpxchg##type((_p), __o, (_n)); \
> + if (unlikely(__r != __o)) \
> + *__po = __r; \
> + likely(__r == __o); \
> })
>
> #define atomic_try_cmpxchg(_p, _po, _n) __atomic_try_cmpxchg(, _p, _po, _n)
> @@ -1022,9 +1024,11 @@ static inline int atomic_dec_if_positive
> #define __atomic64_try_cmpxchg(type, _p, _po, _n) \
> ({ \
> typeof(_po) __po = (_po); \
> - typeof(*(_po)) __o = *__po; \
> - *__po = atomic64_cmpxchg##type((_p), __o, (_n)); \
> - (*__po == __o); \
> + typeof(*(_po)) __r, __o = *__po; \
> + __r = atomic64_cmpxchg##type((_p), __o, (_n)); \
> + if (unlikely(__r != __o)) \
> + *__po = __r; \
> + likely(__r == __o); \
> })
>
> #define atomic64_try_cmpxchg(_p, _po, _n) __atomic64_try_cmpxchg(, _p, _po, _n)


Acked-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>