Re: spin_lock implicit/explicit memory barrier

From: Manfred Spraul
Date: Fri Aug 12 2016 - 14:44:12 EST


Hi Boqun,

On 08/12/2016 04:47 AM, Boqun Feng wrote:
>>> We should not be doing an smp_mb() right after a spin_lock(); it makes
>>> no sense. The spinlock machinery should guarantee us the barriers in
>>> unorthodox locking cases such as this.
>>
>> Do we really want to go there?
>> Trying to handle all unorthodox cases will end up as an endless list of
>> patches, and the list of covered architectures is guaranteed to go
>> stale.

> Right.
>
> If you have:
>
> 6262db7c088b ("powerpc/spinlock: Fix spin_unlock_wait()")
>
> you don't need smp_mb() after spin_lock() on PPC.
>
> And, IIUC, if you have:
>
> 3a5facd09da8 ("arm64: spinlock: fix spin_unlock_wait for LSE atomics")
> d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against
> concurrent lockers")
>
> you don't need smp_mb() after spin_lock() on ARM64.
>
> And, IIUC, if you have:
>
> 2c6100227116 ("locking/qspinlock: Fix spin_unlock_wait() some more")
>
> you don't need smp_mb() after spin_lock() on x86 with qspinlock.
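
For reference, the unorthodox pattern under discussion looks roughly like
this. It is a simplified sketch modelled on ipc/sem.c (the sem_lock()
fast path and the complex-mode slow path); names and details are
approximations, not the exact code:

/* Fast path: take only the per-semaphore lock. */
spin_lock(&sem->lock);
/*
 * spin_lock() is only an acquire: it does not order the store to the
 * lock word against the load of complex_mode below, so without the
 * smp_mb() both sides of this store-buffering pattern could miss
 * each other.
 */
smp_mb();
if (!READ_ONCE(sma->complex_mode)) {
        /* fast path: operate under sem->lock only */
}

/* Slow path: take the global lock, wait for all fast-path holders. */
spin_lock(&sma->sem_perm.lock);
WRITE_ONCE(sma->complex_mode, true);
smp_mb(); /* order the store against the spin_unlock_wait() loop */
for (i = 0; i < sma->sem_nsems; i++)
        spin_unlock_wait(&sma->sem_base[i].lock);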

I would really prefer the other approach:
- spin_lock() is an acquire, that's it. No further guarantees, e.g. no
  ordering of the store to the lock word itself.
- spin_unlock() is a release, that's it.
- generic smp_mb__{before,after}_whatever() helpers, and architectures
  can override them. E.g. if qspinlock on x86 gets the ordering of
  smp_mb__after_spin_lock() for free, then the helper can be a nop.
  A sketch of the generic fallbacks follows below.
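
A minimal sketch of how this could look, assuming the helper names
smp_mb__after_spin_lock() and smp_mb__before_spin_unlock_wait() (both
placeholders for whatever we end up calling them):

/*
 * Arch override, e.g. for x86: spin_lock() is a locked instruction
 * and therefore already a full barrier, so the after-lock helper
 * only needs to stop compiler reordering.
 */
#define smp_mb__after_spin_lock()               barrier()

/* Generic fallbacks, e.g. in linux/spinlock.h: a full barrier each. */
#ifndef smp_mb__after_spin_lock
#define smp_mb__after_spin_lock()               smp_mb()
#endif

#ifndef smp_mb__before_spin_unlock_wait
#define smp_mb__before_spin_unlock_wait()       smp_mb()
#endif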

Right now we are starting to hardcode these guarantees into the
architectures for some callers, while other callers use solutions such as
smp_mb__after_unlock_lock(), i.e. architecture-dependent workarounds in
architecture-independent code.

And: we unnecessarily add overhead. Both ipc/sem and netfilter do loops
over many spinlocks:

for (i = 0; i < CONNTRACK_LOCKS; i++) {
        spin_unlock_wait(&nf_conntrack_locks[i]);
}

One memory barrier would be sufficient, but with the barrier embedded in
spin_unlock_wait() we end up with CONNTRACK_LOCKS barriers.
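
With an explicit helper, the barrier could be hoisted out of the loop
(again only a sketch, using the placeholder name from above):

smp_mb__before_spin_unlock_wait();      /* one barrier for all locks */
for (i = 0; i < CONNTRACK_LOCKS; i++) {
        spin_unlock_wait(&nf_conntrack_locks[i]);
}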

Should I create a patch?
(i.e. documentation and generic helpers)

--
Manfred