On Fri, May 03, 2024 at 10:41:06PM -0400, Waiman Long wrote:

> I purposely don't touch the fast path and add code only to the slow
> path to make sure there shouldn't be any noticeable performance impact.
Not much of a fan of this thing, but I suppose it won't hurt too much ...
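
For reference (quoting roughly from memory of include/asm-generic/qspinlock.h,
so treat it as a sketch rather than gospel), the fast path in question is the
single cmpxchg in queued_spin_lock(); everything in this patch sits behind
queued_spin_lock_slowpath():

static __always_inline void queued_spin_lock(struct qspinlock *lock)
{
	int val = 0;

	/* Uncontended fast path: one atomic cmpxchg, untouched here. */
	if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL)))
		return;

	/* All of the new bookkeeping lives behind this call. */
	queued_spin_lock_slowpath(lock, val);
}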
> Using a bit field will be more expensive, so I am deferring that until
> it becomes necessary.

> diff --git a/kernel/locking/mcs_spinlock.h b/kernel/locking/mcs_spinlock.h
> index 85251d8771d9..cbe6f07dff2e 100644
> --- a/kernel/locking/mcs_spinlock.h
> +++ b/kernel/locking/mcs_spinlock.h
> @@ -13,12 +13,19 @@
>  #ifndef __LINUX_MCS_SPINLOCK_H
>  #define __LINUX_MCS_SPINLOCK_H
>  
> +/*
> + * Save an encoded version of the current MCS lock owner CPU to the
> + * mcs_spinlock structure of the next lock owner.
> + */
> +#define MCS_LOCKED	(smp_processor_id() + 1)
> +
>  #include <asm/mcs_spinlock.h>
>  
>  struct mcs_spinlock {
>  	struct mcs_spinlock *next;
> -	int locked;	/* 1 if lock acquired */
> -	int count;	/* nesting count, see qspinlock.c */
> +	int locked;	/* non-zero if lock acquired */
> +	short count;	/* nesting count, see qspinlock.c */
> +	short prev_node;	/* encoded previous node value */
>  };

Strictly speaking, count shouldn't ever go much higher than 4, so I
suppose a byte should be sufficient. That would then leave 24bit for
the prev thing, but you'll get to use bitfields or some other horrible
thing.
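
Something like the following, say; purely an illustrative sketch of that
packing (the field names come from the patch, the 8/24 split does not):

struct mcs_spinlock {
	struct mcs_spinlock *next;
	int locked;			/* non-zero if lock acquired */
	unsigned int count     :  8;	/* nesting count, see qspinlock.c */
	unsigned int prev_node : 24;	/* encoded previous node value */
};

Note the +1 in MCS_LOCKED keeps zero meaning "not handed off yet", so a
dump tool can recover the handoff CPU as locked - 1.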
>  #ifndef arch_mcs_spin_lock_contended

> @@ -42,7 +49,7 @@ do { \
>   * unlocking.
>   */
>  #define arch_mcs_spin_unlock_contended(l)			\
> -	smp_store_release((l), 1)
> +	smp_store_release((l), MCS_LOCKED)
This leaves ARM up a creek I suppose... Also, there's but a single
MCS_LOCKED user.
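
To make the ARM point concrete: 32-bit ARM carries its own
arch/arm/include/asm/mcs_spinlock.h with a WFE-based implementation whose
unlock side stores a literal 1 (again quoting roughly from memory, so
verify against the tree):

#define arch_mcs_spin_unlock_contended(lock)			\
do {								\
	smp_store_release(lock, 1);	/* plain 1, never MCS_LOCKED */	\
	dsb_sev();			/* wake any CPU parked in wfe() */ \
} while (0)

So on ARM the locked word would keep reading 1 and the encoded owner CPU
would simply never be recorded.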