Re: [PATCH v8 6/7] apei/ghes: Use unrcu_pointer for cmpxchg

From: Ard Biesheuvel
Date: Thu Oct 13 2022 - 11:43:28 EST


On Thu, 13 Oct 2022 at 15:37, Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Wed, Oct 12, 2022 at 12:04:57PM +0000, Justin He wrote:
> > I have a concern about what if cmpxchg failed? Do we have to still
> > guarantee the ordering since cmpxchg will not imply a smp_mb if it
> > failed.
>
> Of course it will imply that. At least on x86 it does. smp_wmb() is a
> compiler barrier there and cmpxchg() already has that barrier semantics
> by clobbering "memory". I'm pretty sure you should have the same thing
> on ARM.
>

No it definitely does not imply that. A memory clobber is a codegen
construct, and the hardware could still complete the writes in a way
that could result in another observer seeing a mix of old and new
values that is inconsistent with the ordering of the stores as issued
by the compiler.

> says, "new_cache must be put into array after its contents are written".
>
> Are we writing anything into the cache if cmpxchg fails?
>

The cache fields get updated but the pointer to the struct is never
shared globally if the cmpxchg() fails so not having the barrier on
failure should not be an issue here.

> > Besides, I didn't find the paired smp_mb or smp_rmb for this smp_wmb.
>
> Why would there be pairs? I don't understand that statement here.
>

Typically, the other observer pairs the write barrier with a read barrier.

In this case, the other observer appears to be ghes_estatus_cached(),
and the reads of the cache struct fields must be ordered after the
read of the cache struct's address. However, there is an implicit
ordering there through an address dependency (you cannot dereference a
struct without knowing its address) so the ordering is implied (and
rcu_dereference() has a READ_ONCE() inside so we are guaranteed to
always dereference the same struct, even if the array slot gets
updated concurrently.)

If you want to get rid of the barrier, you could drop it and change
the cmpxchg() to cmpxchg_release().

Justin: so why are the RCU_INITIALIZER()s needed here?