Re: [PATCH -v4 5/7] locking, arch: Update spin_unlock_wait()

From: Peter Zijlstra
Date: Fri Jun 03 2016 - 09:49:15 EST


On Fri, Jun 03, 2016 at 01:47:34PM +0100, Will Deacon wrote:
> > Now, the normal atomic_foo_acquire() stuff uses smp_mb() as per
> > smp_mb__after_atomic(), it's just ARM64 and PPC that go all 'funny' and
> > need this extra barrier. Blergh. So let's shelve this issue for a bit.
>
> Hmm... I certainly plan to get qspinlock up and running for arm64 in the
> near future, so I'm not keen on shelving it for very long.

Sure; so in the short term we could always have arm64/ppc-specific versions
of these functions where the difference matters.
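
Something like the usual #ifndef override dance; a quick sketch only,
using queued_spin_unlock_wait() as the example, with barrier placement in
the arch version very much TBD:

	/* include/asm-generic/qspinlock.h -- sketch */
	#ifndef queued_spin_unlock_wait
	static inline void queued_spin_unlock_wait(struct qspinlock *lock)
	{
		/* generic: fine when ACQUIRE is globally ordered */
		while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
			cpu_relax();
	}
	#endif

	/* arch/arm64/include/asm/qspinlock.h -- arch override, sketch */
	#define queued_spin_unlock_wait queued_spin_unlock_wait
	static inline void queued_spin_unlock_wait(struct qspinlock *lock)
	{
		smp_mb();	/* order prior accesses against the lock-word loads */
		while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
			cpu_relax();
		smp_mb();	/* and order those loads against later accesses */
	}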

Alternatively we need to introduce yet another barrier like:

smp_mb__after_acquire()

Or something along those lines, which would be a no-op by default except on
arm64 and ppc.
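
Roughly (sketch only, name as above, following the usual asm-generic
override pattern):

	/* include/asm-generic/barrier.h -- sketch */
	#ifndef smp_mb__after_acquire
	#define smp_mb__after_acquire()	barrier()	/* ACQUIRE already globally ordered */
	#endif

	/* arch/arm64/include/asm/barrier.h, ditto powerpc -- sketch */
	#define smp_mb__after_acquire()	smp_mb()	/* upgrade ACQUIRE to a full barrier */

after which the generic locking code would sprinkle it after the
atomic_*_acquire() calls that currently rely on ACQUIRE being globally
ordered.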

But I'm thinking nobody really wants more barrier primitives :/ (says he
who just added one).

> > This unordered store however, can be delayed (store buffer) such that
> > the loads from spin_unlock_wait/spin_is_locked can pass up before it
> > (even on TSO arches).
>
> Right, and this is surprisingly similar to the LL/SC problem imo.

Yes and no. Yes, because it's an unordered store; no, because a competing
LL/SC cannot make that store fail and retry, as is done in your and Boqun's
fancy solution.
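
To make that explicit, the shape of the hazard as I picture it (hypothetical
'flag' and 'l', not any particular in-tree user; the smp_mb() on CPU1 just
stands in for whatever ordering that side already has):

	CPU0:
		spin_lock(&l);		/* the store marking the lock taken
					   sits in CPU0's store buffer */
		r0 = READ_ONCE(flag);	/* and this load can execute before
					   that store becomes visible */

	CPU1:
		WRITE_ONCE(flag, 1);
		smp_mb();
		spin_unlock_wait(&l);	/* the lock-word loads miss CPU0's
					   buffered store: lock looks free */

	Bad outcome: r0 == 0 while CPU1 finds the lock unheld, so both CPUs
	think they went first, which is exactly what callers use the
	flag + spin_unlock_wait() pairing to rule out.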