Re: spin_lock implicit/explicit memory barrier

From: Davidlohr Bueso
Date: Wed Aug 10 2016 - 15:20:48 EST


On Wed, 10 Aug 2016, Manfred Spraul wrote:

On 08/10/2016 02:05 AM, Benjamin Herrenschmidt wrote:
On Tue, 2016-08-09 at 20:52 +0200, Manfred Spraul wrote:
Hi Benjamin, Hi Michael,

regarding commit 51d7d5205d33 ("powerpc: Add smp_mb() to
arch_spin_is_locked()"):

For the ipc/sem code, I would like to replace the spin_is_locked() with
a smp_load_acquire(), see:

http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/tree/ipc/sem.c#n367

http://www.ozlabs.org/~akpm/mmots/broken-out/ipc-semc-fix-complex_count-vs-simple-op-race.patch

To my understanding, I must now add a smp_mb(), otherwise it would be
broken on PowerPC:

The approach that the memory barrier is added into spin_is_locked()
doesn't work because the code doesn't use spin_is_locked().

Correct?
Right, otherwise you aren't properly ordered. The current powerpc locks provide
good protection between what's inside vs. what's outside the lock but not vs.
the lock *value* itself, so if, like you do in the sem code, use the lock
value as something that is relevant in term of ordering, you probably need
an explicit full barrier.

But the problem here is with spin_unlock_wait() (for ll/sc spin_lock) not seeing the
store that makes the lock visibly taken and both threads end up exiting out of sem_lock();
similar scenario to the spin_is_locked commit mentioned above, which is crossing of
locks.

Now that spin_unlock_wait() always implies at least an load-acquire barrier (for both
ticket and qspinlocks, which is still x86 only), we wait on the full critical region.

So this patch takes this locking scheme:

CPU0 CPU1
spin_lock(l) spin_lock(L)
spin_unlock_wait(L) if (spin_is_locked(l))
foo() foo()

... and converts it now to:

CPU0 CPU1
complex_mode = true spin_lock(l)
smp_mb() <--- do we want a smp_mb() here?
spin_unlock_wait(l) if (!smp_load_acquire(complex_mode))
foo() foo()

We should not be doing an smp_mb() right after a spin_lock(), makes no sense. The
spinlock machinery should guarantee us the barriers in the unorthodox locking cases,
such as this.

Thanks,
Davidlohr