Re: [PATCH 3/6] locking/rwsem: Rework writer wakeup

From: Waiman Long
Date: Mon Feb 27 2023 - 15:17:35 EST


On 2/27/23 05:31, Peter Zijlstra wrote:
I do have some concern that early lock transfer to a lock owner that has not
been woken up yet may suppress writer lock stealing from optimistic spinning
causing some performance regression in some cases. Let's see if the test
robot report anything.
Ah yes, I suppose that is indeed a possibility. Given this is all under
wait_lock and the spinner is not, I was hoping it would still have
sufficient time to win. But yes, robots will tell us.

I run my rwsem locking microbenchmark on a 2-socket 96-thread x86-64
system with lock event turned on for 15 secs.

Before this patchset:

Running locktest with rwsem [runtime = 15s, r% = 50%, load = 100]
Threads = 96, Min/Mean/Max = 74,506/91,260/112,409
Threads = 96, Total Rate = 584,091 op/s; Percpu Rate = 6,084 op/s

rwsem_opt_fail=127305
rwsem_opt_lock=4252147
rwsem_opt_nospin=28920
rwsem_rlock=2713129
rwsem_rlock_fail=0
rwsem_rlock_fast=5
rwsem_rlock_handoff=280
rwsem_rlock_steal=1486617
rwsem_sleep_reader=2713085
rwsem_sleep_writer=4313369
rwsem_wake_reader=29876
rwsem_wake_writer=5829160
rwsem_wlock=127305
rwsem_wlock_fail=0
rwsem_wlock_handoff=2515

After this patchset:

Running locktest with rwsem [runtime = 15s, r% = 50%, load = 100]
Threads = 96, Min/Mean/Max = 26,573/26,749/26,833
Threads = 96, Total Rate = 171,184 op/s; Percpu Rate = 1,783 op/s

rwsem_opt_fail=1265481
rwsem_opt_lock=17939
rwsem_rlock=1266157
rwsem_rlock_fail=0
rwsem_rlock_fast=0
rwsem_rlock_handoff=0
rwsem_rlock_steal=551
rwsem_sleep_reader=1266157
rwsem_sleep_writer=1265481
rwsem_wake_reader=26612
rwsem_wake_writer=0
rwsem_wlock=1265481
rwsem_wlock_ehandoff=94
rwsem_wlock_fail=0
rwsem_wlock_handoff=94

So the locking rate is reduced to just 29.3% of the original. Looking at
the number of successful writer lock stealings from optimistic spinning
(rwsem_opt_lock), it is reduced from 4252147 to 17939. It is just about
0.4% of the original.

So for workloads that have a lot of writer contention, there will be
performance regressions. Do you mind if we try to keep the original
logic of my patchset to allow write lock acquisition in writer slow
path, but transfer the lock ownership in the wakeup path when handoff
is required. We can do this with some minor code changes on top of your
current patchset.

Regards,
Longman