Re: [PATCH v3 1/7] asm-generic: ticket-lock: New generic ticket-based spinlock

From: Palmer Dabbelt
Date: Fri Apr 15 2022 - 12:46:34 EST


On Thu, 14 Apr 2022 18:27:12 PDT (-0700), longman@xxxxxxxxxx wrote:
On 4/14/22 18:02, Palmer Dabbelt wrote:
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>

This is a simple, fair spinlock. Specifically, it doesn't have all the
subtle memory-model dependencies that qspinlock has, which makes it more
suitable for simple systems, as it is more likely to be correct. It is
implemented entirely in terms of standard atomics and thus works fine
without any arch-specific code.

This replaces the existing asm-generic/spinlock.h, which just errored
out on SMP systems.

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Signed-off-by: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
---
include/asm-generic/spinlock.h | 85 +++++++++++++++++++++++++---
include/asm-generic/spinlock_types.h | 17 ++++++
2 files changed, 94 insertions(+), 8 deletions(-)
create mode 100644 include/asm-generic/spinlock_types.h

diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index adaf6acab172..ca829fcb9672 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -1,12 +1,81 @@
/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __ASM_GENERIC_SPINLOCK_H
-#define __ASM_GENERIC_SPINLOCK_H
+
/*
- * You need to implement asm/spinlock.h for SMP support. The generic
- * version does not handle SMP.
+ * 'Generic' ticket-lock implementation.
+ *
+ * It relies on atomic_fetch_add() having well-defined forward-progress
+ * guarantees under contention. If your architecture cannot provide this, stick
+ * to a test-and-set lock.
+ *
+ * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
+ * sub-word of the value. This is generally true for anything LL/SC, although
+ * you'd be hard-pressed to find anything useful in architecture specifications
+ * about this. If your architecture cannot do this, you might be better off
+ * with a test-and-set.
+ *
+ * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
+ * uses atomic_fetch_add(), which is SC, to create an RCsc lock.
+ *
+ * The implementation uses smp_cond_load_acquire() to spin, so if the
+ * architecture has WFE-like instructions to sleep instead of polling for word
+ * modifications, be sure to implement that (see ARM64 for example).
+ *
*/
-#ifdef CONFIG_SMP
-#error need an architecture specific asm/spinlock.h
-#endif

-#endif /* __ASM_GENERIC_SPINLOCK_H */
+#ifndef __ASM_GENERIC_TICKET_LOCK_H
+#define __ASM_GENERIC_TICKET_LOCK_H
It is not conventional to use a macro name that is different from the
header file name.

Sorry, that was just a mistake: I renamed the header but forgot to rename the guard. I'll likely send a v4 due to Boqun's questions, and I'll fix this there as well.
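For reference, the fix would presumably just restore a guard that matches the file name, i.e. what the old header used:

#ifndef __ASM_GENERIC_SPINLOCK_H
#define __ASM_GENERIC_SPINLOCK_H
...
#endif /* __ASM_GENERIC_SPINLOCK_H */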

+
+#include <linux/atomic.h>
+#include <asm-generic/spinlock_types.h>
+
+static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
+{
+ u32 val = atomic_fetch_add(1<<16, lock); /* SC, gives us RCsc */
+ u16 ticket = val >> 16;
+
+ if (ticket == (u16)val)
+ return;
+
+ atomic_cond_read_acquire(lock, ticket == (u16)VAL);
+}
+
+static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
+{
+ u32 old = atomic_read(lock);
+
+ if ((old >> 16) != (old & 0xffff))
+ return false;
+
+ return atomic_try_cmpxchg(lock, &old, old + (1<<16)); /* SC, for RCsc */
+}
+
+static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
+{
+ u16 *ptr = (u16 *)lock + IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
+ u32 val = atomic_read(lock);
+
+ smp_store_release(ptr, (u16)val + 1);
+}
+
+static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
+{
+ u32 val = atomic_read(lock);
+
+ return ((val >> 16) != (val & 0xffff));
+}
+
+static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
+{
+ u32 val = atomic_read(lock);
+
+ return (s16)((val >> 16) - (val & 0xffff)) > 1;
+}
+
+static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
+{
+ return !arch_spin_is_locked(&lock);
+}
+
+#include <asm/qrwlock.h>
+
+#endif /* __ASM_GENERIC_TICKET_LOCK_H */
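As an aside on the encoding: the lock word packs the "next" ticket in the high 16 bits and the "owner" ticket in the low 16 bits, and the lock is free whenever the two halves are equal (not necessarily zero). Here is a minimal user-space model of that hand-off in C11 atomics; it is a sketch only (the names are made up, the two halves are split into separate fields rather than one packed u32, and plain acquire loads stand in for the kernel's atomic_t API and smp_cond_load_acquire()):

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative only: the kernel packs both halves into a single u32 so
 * trylock/is_locked can snapshot the whole lock in one atomic_read(), and
 * arch_spin_unlock() locates the owner half via the CONFIG_CPU_BIG_ENDIAN
 * pointer adjustment.
 */
struct demo_ticket_lock {
	_Atomic uint16_t next;	/* high half of the kernel's lock word */
	_Atomic uint16_t owner;	/* low half of the kernel's lock word */
};

static void demo_lock(struct demo_ticket_lock *l)
{
	/* Models atomic_fetch_add(1 << 16, lock): take a ticket, fully ordered. */
	uint16_t ticket = atomic_fetch_add(&l->next, 1);

	/* Models atomic_cond_read_acquire(): spin until our ticket is served. */
	while (atomic_load_explicit(&l->owner, memory_order_acquire) != ticket)
		;	/* the kernel sleeps in smp_cond_load_acquire() here */
}

static void demo_unlock(struct demo_ticket_lock *l)
{
	/*
	 * Models the sub-word smp_store_release(): bump only the owner half,
	 * so a wrapping owner never carries into the next-ticket half.
	 */
	uint16_t owner = atomic_load_explicit(&l->owner, memory_order_relaxed);
	atomic_store_explicit(&l->owner, (uint16_t)(owner + 1),
			      memory_order_release);
}

int main(void)
{
	struct demo_ticket_lock l = { 0, 0 };

	demo_lock(&l);		/* ticket 0 == owner 0: uncontended fast path */
	demo_unlock(&l);	/* owner becomes 1; next == owner is unlocked */
	printf("next=%u owner=%u\n",
	       (unsigned)atomic_load(&l.next), (unsigned)atomic_load(&l.owner));
	return 0;
}

Running it prints "next=1 owner=1": the lock ends up unlocked at a non-zero value, which is why arch_spin_is_locked() above compares the two halves rather than testing the word against zero.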
diff --git a/include/asm-generic/spinlock_types.h b/include/asm-generic/spinlock_types.h
new file mode 100644
index 000000000000..e56ddb84d030
--- /dev/null
+++ b/include/asm-generic/spinlock_types.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_GENERIC_TICKET_LOCK_TYPES_H
+#define __ASM_GENERIC_TICKET_LOCK_TYPES_H
+
+#include <linux/types.h>
+typedef atomic_t arch_spinlock_t;
+
+/*
+ * qrwlock_types depends on arch_spinlock_t, so we must typedef that before the
+ * include.
+ */
+#include <asm/qrwlock_types.h>

I believe that if you guard the include line with

#ifdef CONFIG_QUEUED_RWLOCKS
#include <asm/qrwlock_types.h>
#endif

you may not need to do the hack in patch 5.

Yes, and we actually had it that way the first time around (specifically ARCH_USE_QUEUED_RWLOCKS, but IIUC that's the same thing here). The goal was to avoid adding the ifdef to the asm-generic code and instead keep the oddness in arch/riscv; it's only there for that one commit (and just so we can split the spinlock conversion from the rwlock conversion, in case there's a bug and these need to be bisected later).

I'd also considered renaming qrwlock* to rwlock*, which would avoid the ifdef and make it a touch easier to override the rwlock implementation, but that didn't seem useful enough to warrant the diff. These all seem a bit more coupled than I expected them to be (both {spin,qrw}lock{,_types}.h and the bits in linux/); I looked into cleaning that up a bit, but it seemed like too much for just the one patch set.

You can also include <asm-generic/qrwlock_types.h> directly, without importing it into include/asm.

Yes, along with qrwlock.h (which has some unnecessary #include shims in a handful of arch dirs). That's going to make the patch set bigger, but I'll include it in the v4.
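To illustrate that suggestion (a sketch only, not the posted v4): the include at the bottom of the new spinlock_types.h would just point at asm-generic directly,

#include <asm-generic/qrwlock_types.h>

which also removes the need for per-arch asm/qrwlock_types.h shim headers.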

Thanks!