[PATCH 2/3] locking: Fix comment for smp_mb__after_spinlock()

From: Andrea Parri
Date: Wed Jun 27 2018 - 04:53:30 EST


Boqun reported that the snippet in the header comment for
smp_mb__after_spinlock() can be misleading: the ordering it illustrates
is already provided by ordinary acquire/release chains (thanks to the
A-cumulativity of release), so it does not show what the barrier
actually adds.

This commit fixes the comment by following Boqun's example in [1].
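
For reference, a minimal sketch of the kind of release/acquire chain
involved (this is not Boqun's example verbatim; the variables and the
release/acquire pair standing in for the lock/scheduler hand-off are
illustrative):

  CPU0                      CPU1                         CPU2

  WRITE_ONCE(X, 1);         r1 = READ_ONCE(X);           r2 = smp_load_acquire(&Y);
                            smp_store_release(&Y, 1);    r3 = READ_ONCE(X);

Here (r1=1 && r2=1 && r3=0) is forbidden even without any
smp_mb__after_spinlock(): the store to X propagates to CPU1 before the
release, and the A-cumulativity of the release orders it before the
store to Y for all CPUs, CPU2 included.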

It's worth noting here that the LKMM maintainers are actively debating
whether to enforce RCsc transitivity of locking operations "by
definition" [2]; should that guarantee be enforced in the future, the
requirements on smp_mb__after_spinlock() could be reduced to the
STORE->LOAD ordering alone.
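
For concreteness, a sketch of the STORE->LOAD (store-buffering style)
pattern that would then remain to be guaranteed (names are illustrative):

  CPU0                          CPU1

  WRITE_ONCE(X, 1);             WRITE_ONCE(Y, 1);
  spin_lock(s);                 smp_mb();
  smp_mb__after_spinlock();     r1 = READ_ONCE(X);
  r0 = READ_ONCE(Y);
  spin_unlock(s);

that is, (r0=0 && r1=0) must remain forbidden: an ACQUIRE alone does not
order the program-order earlier stores against the program-order later
load, hence the need for the full barrier.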

[1] http://lkml.kernel.org/r/20180312085600.aczjkpn73axzs2sb@tardis
[2] http://lkml.kernel.org/r/Pine.LNX.4.44L0.1711271553490.1424-100000@xxxxxxxxxxxxxxxxxxxx
    http://lkml.kernel.org/r/Pine.LNX.4.44L0.1806211322160.2381-100000@xxxxxxxxxxxxxxxxxxxx

Reported-and-Suggested-by: Boqun Feng <boqun.feng@xxxxxxxxx>
Signed-off-by: Andrea Parri <andrea.parri@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Will Deacon <will.deacon@xxxxxxx>
Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
---
include/linux/spinlock.h | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 1e8a464358384..c74828fe8d75c 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -121,22 +121,22 @@ do { \
*
* - it must ensure the critical section is RCsc.
*
- * The latter is important for cases where we observe values written by other
- * CPUs in spin-loops, without barriers, while being subject to scheduling.
+ * The latter requirement guarantees that stores from two critical sections
+ * in different CPUs are ordered even outside the critical sections. As an
+ * example illustrating this property, consider the following snippet:
*
- * CPU0             CPU1                CPU2
+ * CPU0                   CPU1                       CPU2
*
- *                  for (;;) {
- *                    if (READ_ONCE(X))
- *                      break;
- *                  }
- * X=1
- *                  <sched-out>
- *                                      <sched-in>
- *                                      r = X;
+ * spin_lock(s);          spin_lock(s);              r2 = READ_ONCE(Y);
+ * WRITE_ONCE(X, 1);      smp_mb__after_spinlock();  smp_rmb();
+ * spin_unlock(s);        r1 = READ_ONCE(X);         r3 = READ_ONCE(X);
+ *                        WRITE_ONCE(Y, 1);
+ *                        spin_unlock(s);
*
- * without transitivity it could be that CPU1 observes X!=0 breaks the loop,
- * we get migrated and CPU2 sees X==0.
+ * Without RCsc transitivity, it is allowed that CPU0's critical section
+ * precedes CPU1's critical section (r1=1) and that CPU2 observes CPU1's
+ * store to Y (r2=1) while it does not observe CPU0's store to X (r3=0),
+ * despite the presence of the smp_rmb().
*
* Since most load-store architectures implement ACQUIRE with an smp_mb() after
* the LL/SC loop, they need no further barriers. Similarly all our TSO
--
2.7.4