Re: [PATCH v3 2/5] locking/rwsem: Limit # of null owner retries for handoff writer
From: Waiman Long
Date: Mon Oct 24 2022 - 13:24:59 EST
On 10/24/22 09:31, Peter Zijlstra wrote:
> On Mon, Oct 17, 2022 at 05:13:53PM -0400, Waiman Long wrote:
> > Commit 91d2a812dfb9 ("locking/rwsem: Make handoff writer optimistically
> > spin on owner") assumes that when the owner field is changed to NULL,
> > the lock will become free soon. That assumption may not be correct,
> > especially if the handoff writer doing the spinning is an RT task which
> > may preempt another task from completing its action of either freeing
> > the rwsem or properly setting up owner.
>
> I'm confused again -- rwsem_*_owner() has
> lockdep_assert_preemption_disabled(). But more specifically; why can the
> RT task preempt a lock-op like that?
There is a special case raised by Mukesh that can happen. His text is
quoted here:
---------------------------
Looks like there is still a window for a race.

There is a chance that a reader which came first adds its BIAS and heads
to the slowpath, but before it gets added to the wait list it is
preempted by an RT task. That RT task goes to the slowpath as well and,
being the first waiter, gets its hand-off bit set, yet it is not able to
get the lock due to the following condition in rwsem_try_write_lock():
	if (count & RWSEM_LOCK_MASK) {		/* reader has set its bias */
		...
		new |= RWSEM_FLAG_HANDOFF;
	} else {
		new |= RWSEM_WRITER_LOCKED;
------------------------>------------------------->---------------------
First reader (1)          Writer (2), an RT task     Lock holder (3)

It sets
RWSEM_READER_BIAS.
While it is going to
the slowpath (as the
lock is held by (3)),
and before it gets
added to the waiters
list, it is preempted
by (2).
                          The RT task also takes     Releases the
                          the slowpath and adds      rwsem lock and
                          itself to the wait list.   clears the
                          Being the first waiter,    owner field.
                          it is the next one to
                          get the lock, but it
                          cannot take it because
                          (count & RWSEM_LOCK_MASK)
                          is set: (1) has added
                          its bias but has not
                          been able to remove
                          that adjustment.
----------------------
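To make the failing condition concrete, here is a small userspace model
of the count word (my illustration, not kernel code; the flag values
mirror kernel/locking/rwsem.c, and the racing steps are replayed
sequentially for clarity):

	#include <stdatomic.h>
	#include <stdio.h>

	/* Illustrative model of the rwsem count word. */
	#define RWSEM_WRITER_LOCKED	(1UL << 0)
	#define RWSEM_READER_SHIFT	8
	#define RWSEM_READER_BIAS	(1UL << RWSEM_READER_SHIFT)
	#define RWSEM_READER_MASK	(~(RWSEM_READER_BIAS - 1))
	#define RWSEM_LOCK_MASK		(RWSEM_WRITER_LOCKED | RWSEM_READER_MASK)

	int main(void)
	{
		atomic_ulong count = RWSEM_WRITER_LOCKED; /* (3) holds the lock */

		/* (1) reader fast path: add the bias, see the lock held,
		 * head for the slowpath... and get preempted right here
		 * by (2), before the slowpath removes the bias. */
		atomic_fetch_add(&count, RWSEM_READER_BIAS);

		/* (3) releases the lock and clears the owner field. */
		atomic_fetch_and(&count, ~RWSEM_WRITER_LOCKED);

		/* (2), the handoff writer, now evaluates the condition from
		 * rwsem_try_write_lock(): the stale reader bias keeps
		 * RWSEM_LOCK_MASK set, so the trylock keeps failing for as
		 * long as (1) stays preempted. */
		unsigned long c = atomic_load(&count);
		printf("count & RWSEM_LOCK_MASK = %#lx => writer %s\n",
		       c & RWSEM_LOCK_MASK,
		       (c & RWSEM_LOCK_MASK) ? "cannot take the lock"
					     : "gets the lock");
		return 0;
	}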
To fix that, we either have to disable preemption in down_read() and
re-enable it in rwsem_down_read_slowpath() after decrementing the
RWSEM_READER_BIAS, or we have to limit the number of trylock-spinning
attempts as this patch does. The latter approach seems a bit less messy,
and I am going to take it back out anyway in patch 4. I will put a
summary of that special case in the patch description.
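For illustration, a minimal sketch of the first option; the simplified
signatures for rwsem_read_trylock() and rwsem_down_read_slowpath() are
assumptions for the sketch (the real ones take more arguments), and the
placement of the preempt_*() calls is illustrative, not the actual diff:

	/*
	 * Sketch only: keep the reader non-preemptible from the moment it
	 * adds RWSEM_READER_BIAS until the slowpath has taken the bias
	 * back out, closing the window in which a handoff writer can spin
	 * on a stale reader bias.
	 */
	static inline void __down_read(struct rw_semaphore *sem)
	{
		preempt_disable();
		if (rwsem_read_trylock(sem)) {
			preempt_enable();	/* fast path: lock acquired */
			return;
		}
		/* The slowpath would re-enable preemption right after it
		 * removes RWSEM_READER_BIAS from sem->count. */
		rwsem_down_read_slowpath(sem);
	}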
Cheers,
Longman