Re: [PATCH 3/6] locking/rwsem: Rework writer wakeup

From: Peter Zijlstra
Date: Mon Mar 20 2023 - 04:12:57 EST


On Mon, Feb 27, 2023 at 03:16:25PM -0500, Waiman Long wrote:
> On 2/27/23 05:31, Peter Zijlstra wrote:
> > > I do have some concern that early lock transfer to a lock owner that has not
> > > been woken up yet may suppress writer lock stealing from optimistic spinning
> > > causing some performance regression in some cases. Let's see if the test
> > > robot report anything.
> > Ah yes, I suppose that is indeed a possibility. Given this is all under
> > wait_lock and the spinner is not, I was hoping it would still have
> > sufficient time to win. But yes, robots will tell us.
> >
> I run my rwsem locking microbenchmark on a 2-socket 96-thread x86-64
> system with lock event turned on for 15 secs.
>
> Before this patchset:
>
> Running locktest with rwsem [runtime = 15s, r% = 50%, load = 100]
> Threads = 96, Min/Mean/Max = 74,506/91,260/112,409
> Threads = 96, Total Rate = 584,091 op/s; Percpu Rate = 6,084 op/s
>
> rwsem_opt_fail=127305
> rwsem_opt_lock=4252147
> rwsem_opt_nospin=28920
> rwsem_rlock=2713129
> rwsem_rlock_fail=0
> rwsem_rlock_fast=5
> rwsem_rlock_handoff=280
> rwsem_rlock_steal=1486617
> rwsem_sleep_reader=2713085
> rwsem_sleep_writer=4313369
> rwsem_wake_reader=29876
> rwsem_wake_writer=5829160
> rwsem_wlock=127305
> rwsem_wlock_fail=0
> rwsem_wlock_handoff=2515
>
> After this patchset:
>
> Running locktest with rwsem [runtime = 15s, r% = 50%, load = 100]
> Threads = 96, Min/Mean/Max = 26,573/26,749/26,833
> Threads = 96, Total Rate = 171,184 op/s; Percpu Rate = 1,783 op/s
>
> rwsem_opt_fail=1265481
> rwsem_opt_lock=17939
> rwsem_rlock=1266157
> rwsem_rlock_fail=0
> rwsem_rlock_fast=0
> rwsem_rlock_handoff=0
> rwsem_rlock_steal=551
> rwsem_sleep_reader=1266157
> rwsem_sleep_writer=1265481
> rwsem_wake_reader=26612
> rwsem_wake_writer=0
> rwsem_wlock=1265481
> rwsem_wlock_ehandoff=94
> rwsem_wlock_fail=0
> rwsem_wlock_handoff=94
>
> So the locking rate is reduced to just 29.3% of the original. Looking at
> the number of successful writer lock stealings from optimistic spinning
> (rwsem_opt_lock), it is reduced from 4252147 to 17939. It is just about
> 0.4% of the original.
>
> So for workloads that have a lot of writer contention, there will be
> performance regressions. Do you mind if we try to keep the original
> logic of my patchset to allow write lock acquisition in writer slow
> path, but transfer the lock ownership in the wakeup path when handoff
> is required. We can do this with some minor code changes on top of your
> current patchset.

Urgh, sorry, I seem to have lost sight of this... those results,..
sadness :/

Yeah, I suppose there's nothing for it but to have live with that mess,
be very sure to add comments eludicating any future poor sod reading it
as to why the code is the way it is.