[PATCH v2] locking/rwsem: Take read lock immediate if queue empty with no writer

From: Waiman Long
Date: Fri Jul 13 2018 - 14:31:06 EST


It was discovered that a constant stream of readers might cause the
count to go negative most of the time after an initial trigger by a
writer even if no writer was present afterward. As a result, most of the
readers would have to go through the slowpath reducing their performance.

To avoid that from happening, an additional check is added to detect
the special case that the reader in the critical section is the only
one in the wait queue and no writer is present. When that happens, it
can just have the lock and return immediately without further action.
Other incoming readers won't see a waiter is present and be forced
into the slowpath.

The additional code is in the slowpath and so should not have an impact
on rwsem performance. However, in the special case listed above, it may
greatly improve performance.

The issue was found in a customer site where they had an application
that pounded on the pread64 syscalls heavily on an XFS filesystem. The
application was run in a recent 4-socket boxes with a lot of CPUs. They
saw significant spinlock contention in the rwsem_down_read_failed() call.
With this patch applied, the system CPU usage went from 85% to 57%,
and the spinlock contention in the pread64 syscalls was gone.

v2: Add customer testing results and remove wording that may cause
confusion.

Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
---
kernel/locking/rwsem-xadd.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 3064c50..bf0570e 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -233,8 +233,19 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
waiter.type = RWSEM_WAITING_FOR_READ;

raw_spin_lock_irq(&sem->wait_lock);
- if (list_empty(&sem->wait_list))
+ if (list_empty(&sem->wait_list)) {
+ /*
+ * In the unlikely event that the task is the only one in
+ * the wait queue and a writer isn't present, it can have
+ * the lock and return immediately without going through
+ * the remaining slowpath code.
+ */
+ if (unlikely(atomic_long_read(&sem->count) >= 0)) {
+ raw_spin_unlock_irq(&sem->wait_lock);
+ return sem;
+ }
adjustment += RWSEM_WAITING_BIAS;
+ }
list_add_tail(&waiter.list, &sem->wait_list);

/* we're now waiting on the lock, but no longer actively locking */
--
1.8.3.1