[RFC][PATCH] locking/mcs: Fix ordering for mcs_spin_lock()

From: Peter Zijlstra
Date: Mon Feb 01 2016 - 09:37:37 EST



Given the below patch; we've now got an unconditional full global
barrier in, does this make the MCS spinlock RCsc ?

The 'problem' is that this barrier can happen before we actually acquire
the lock. That is, if we hit arch_mcs_spin_lock_contended() _that_ will
be the acquire barrier and we end up with a SYNC in between unlock and
lock -- ie. not an smp_mb__after_unlock_lock() equivalent.



---
Subject: locking/mcs: Fix ordering for mcs_spin_lock()
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Mon Feb 1 15:11:28 CET 2016

Similar to commit b4b29f94856a ("locking/osq: Fix ordering of node
initialisation in osq_lock") the use of xchg_acquire() is
fundamentally broken with MCS like constructs.

Furthermore, it turns out we rely on the global transitivity of this
operation because the unlock path observes the pointer with a
READ_ONCE(), not an smp_load_acquire().

This is non-critical because the MCS code isn't actually used and
mostly serves as documentation, a stepping stone to the more complex
things we've build on top of the idea.

Cc: Will Deacon <will.deacon@xxxxxxx>
Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
Reported-by: Andrea Parri <parri.andrea@xxxxxxxxx>
Fixes: 3552a07a9c4a ("locking/mcs: Use acquire/release semantics")
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
kernel/locking/mcs_spinlock.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

--- a/kernel/locking/mcs_spinlock.h
+++ b/kernel/locking/mcs_spinlock.h
@@ -67,7 +67,13 @@ void mcs_spin_lock(struct mcs_spinlock *
node->locked = 0;
node->next = NULL;

- prev = xchg_acquire(lock, node);
+ /*
+ * We rely on the full barrier with global transitivity implied by the
+ * below xchg() to order the initialization stores above against any
+ * observation of @node. And to provide the ACQUIRE ordering associated
+ * with a LOCK primitive.
+ */
+ prev = xchg(lock, node);
if (likely(prev == NULL)) {
/*
* Lock acquired, don't need to set node->locked to 1. Threads