Re: [PATCH] rwsem: reduce spinlock contention in wakeup code path

From: Waiman Long
Date: Fri Sep 27 2013 - 20:47:09 EST


On 09/27/2013 03:32 PM, Peter Hurley wrote:
On 09/27/2013 03:00 PM, Waiman Long wrote:
With the 3.12-rc2 kernel, there is sizable spinlock contention on
the rwsem wakeup code path when running AIM7's high_systime workload
on an 8-socket 80-core DL980 (HT off) as reported by perf:

7.64% reaim [kernel.kallsyms] [k] _raw_spin_lock_irqsave
|--41.77%-- rwsem_wake
1.61% reaim [kernel.kallsyms] [k] _raw_spin_lock_irq
|--92.37%-- rwsem_down_write_failed

That was 4.7% of recorded CPU cycles.
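
The 4.7% figure appears to be the sum of the two call-chain shares in the perf output above: each symbol's share of all samples, scaled by the fraction of that symbol attributed to the rwsem caller. A minimal sketch of that arithmetic follows; the helper names are made up for illustration and the numbers are copied from the perf lines.

```c
/*
 * Share of all samples attributable to one caller of a symbol:
 * (symbol's % of all samples) * (caller's % of that symbol) / 100.
 */
static double caller_share(double symbol_pct, double caller_pct)
{
    return symbol_pct * caller_pct / 100.0;
}

/* Combined rwsem-related spinlock share from the two perf entries. */
static double rwsem_spinlock_share(void)
{
    double wake_share  = caller_share(7.64, 41.77); /* rwsem_wake */
    double write_share = caller_share(1.61, 92.37); /* rwsem_down_write_failed */

    return wake_share + write_share; /* about 4.68, i.e. roughly 4.7% */
}
```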

On a large NUMA machine, it is entirely possible that a fairly large
number of threads are queuing up in the ticket spinlock queue to do
the wakeup operation. In fact, only one will be needed. This patch
tries to reduce spinlock contention by doing just that.

A new wakeup field is added to the rwsem structure. This field is
set on entry to rwsem_wake() and __rwsem_do_wake() to mark that a
thread is pending to do the wakeup call. It is cleared on exit from
those functions.

By checking if the wakeup flag is set, a thread can exit rwsem_wake()
immediately if another thread is pending to do the wakeup instead of
waiting to get the spinlock and find out that nothing needs to be done.
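
The early-exit idea described above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not the patch itself: struct model_rwsem, model_init() and model_rwsem_wake() are hypothetical names, a C11 atomic_flag stands in for the wait_lock spinlock, and "waking" is reduced to moving a counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct rw_semaphore. */
struct model_rwsem {
    atomic_flag wait_lock;  /* stands in for the wait_lock spinlock */
    atomic_bool wakeup;     /* hint: a thread is already pending to wake */
    int waiters;            /* queued sleepers, protected by wait_lock */
    int woken;              /* sleepers woken so far */
};

static void model_init(struct model_rwsem *sem, int waiters)
{
    atomic_flag_clear(&sem->wait_lock);
    atomic_init(&sem->wakeup, false);
    sem->waiters = waiters;
    sem->woken = 0;
}

/* Returns true if this caller did the wakeup, false if it left early. */
static bool model_rwsem_wake(struct model_rwsem *sem)
{
    /* Fast path: another thread is pending to do the wakeup. */
    if (atomic_load(&sem->wakeup))
        return false;

    atomic_store(&sem->wakeup, true);       /* set on entry */

    while (atomic_flag_test_and_set(&sem->wait_lock))
        ;                                   /* spin for wait_lock */

    sem->woken += sem->waiters;             /* "wake" everyone queued */
    sem->waiters = 0;

    atomic_store(&sem->wakeup, false);      /* cleared on exit */
    atomic_flag_clear(&sem->wait_lock);
    return true;
}
```

In this model, a caller that finds the flag set never touches wait_lock at all, which is where the contention saving would come from. The flag is only a hint, so correctness depends on the pending waker actually completing the wakeup.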

This will leave readers stranded if a former writer is in __rwsem_do_wake
to wake up the readers and another writer steals the lock. If the
lock-stealing writer then drops the lock before the former writer exits
(without having woken up the readers), it sees that the wakeup flag is
still set and so doesn't bother to wake the readers.
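
The interleaving described above can be replayed step by step in a small sequential model. This is a sketch only: struct race_model and both functions are hypothetical, and it assumes the former writer gives up on the wakeup because it observed the lock as stolen.

```c
#include <stdbool.h>

/* Hypothetical sequential model of the rwsem state; not kernel code. */
struct race_model {
    bool write_locked;      /* a writer currently holds the lock */
    bool wakeup;            /* the proposed "wakeup pending" flag */
    int readers_waiting;    /* readers asleep on the wait queue */
    int readers_woken;      /* readers that have been woken */
};

/* Wake path with the early-exit check; true if readers were woken. */
static bool wake_with_flag_check(struct race_model *m)
{
    if (m->wakeup)
        return false;       /* someone else is pending: leave early */
    m->readers_woken += m->readers_waiting;
    m->readers_waiting = 0;
    return true;
}

/* Replays the problematic ordering; returns readers left stranded. */
static int replay_stranded_readers(struct race_model *m)
{
    /* 1. Writer A drops the lock and enters the wake path. */
    m->write_locked = false;
    m->wakeup = true;

    /* 2. Writer B steals the lock before A has woken anyone. */
    m->write_locked = true;

    /* 3. A observes the stolen lock and decides not to wake readers. */
    bool a_will_wake = !m->write_locked;

    /* 4. B drops the lock and tries to wake, but the flag is still set. */
    m->write_locked = false;
    (void)wake_with_flag_check(m);

    /* 5. A exits, clearing the flag, without having woken the readers. */
    if (a_will_wake) {
        m->readers_woken += m->readers_waiting;
        m->readers_waiting = 0;
    }
    m->wakeup = false;

    return m->readers_waiting;
}
```

At the end of the replay the lock is free and the flag is clear, yet the readers are still queued and no thread remains that will wake them.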

Regards,
Peter Hurley


Yes, you are right. That can be a problem. Thanks for pointing this out. The workloads that I used don't seem to exercise the readers. I will modify the patch to add code to handle this failure case by resetting the wakeup flag, pushing it out and then retrying one more time to get the read lock. I think that should address the problem.
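
One possible reading of that failure-path handling, as a sequential sketch: the waker that finds the lock stolen clears the flag first, then checks the lock once more. All names here are hypothetical, the real change may look quite different, and actual kernel code would need a memory barrier after clearing the flag so other CPUs see the store before the retry.

```c
#include <stdbool.h>

/* Hypothetical sequential model; not the kernel's rwsem implementation. */
struct retry_model {
    bool write_locked;      /* a writer currently holds the lock */
    bool wakeup;            /* the "wakeup pending" flag */
    int readers_waiting;
    int readers_woken;
};

static void grant_readers(struct retry_model *m)
{
    m->readers_woken += m->readers_waiting;
    m->readers_waiting = 0;
}

/* Ordinary wake path: leave early if another waker is pending. */
static bool wake_unless_pending(struct retry_model *m)
{
    if (m->wakeup)
        return false;
    grant_readers(m);
    return true;
}

/*
 * Failure path for a waker that saw the lock stolen (assumed behaviour):
 * reset the flag so later wakers are not turned away, then retry once.
 * Returns true if the retry woke the readers.
 */
static bool waker_failure_path(struct retry_model *m)
{
    m->wakeup = false;      /* real code: publish this with a barrier */

    if (!m->write_locked) { /* one retry: has the stealer gone already? */
        grant_readers(m);
        return true;
    }
    /* Lock still held: the holder's unlock will now find the flag clear. */
    return false;
}
```

In this model the readers are woken in both orderings: if the stealing writer still holds the lock, its later unlock finds the flag clear and does the wakeup; if it has already left, the single retry succeeds.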

Regards,
Longman