Re: [PATCH v6 5/6] locking/rwsem: Enable direct rwsem lock handoff

From: Waiman Long
Date: Mon Jan 23 2023 - 17:10:00 EST


On 1/23/23 12:30, Waiman Long wrote:
> I will update the patch description to highlight the points that I
> discussed in this email.

I am planning to update the patch description as follows:

    The lock handoff provided in rwsem isn't a true handoff like that in
    the mutex. Instead, it is more like a quiescent state where optimistic
    spinning and lock stealing are disabled to make it easier for the first
    waiter to acquire the lock.

    For mutex, lock handoff is done at unlock time as the owner value and
    the handoff bit are in the same lock word and can be updated atomically.

    That is not the case for rwsem, which has a separate count value for
    locking and an owner value. The only way to update them in a quasi-atomic
    way is to use the wait_lock for synchronization as the handoff bit can
    only be updated while holding the wait_lock. So for rwsem, the new
    lock handoff mechanism is done mostly at rwsem_wake() time when the
    wait_lock has to be acquired anyway to minimize additional overhead.

    Passing the count value at unlock time down to rwsem_wake() to determine
    if handoff should be done is not safe as the waiter that set the
    RWSEM_FLAG_HANDOFF bit may have been interrupted or killed in the
    interim. So we need to recheck the count value after taking the
    wait_lock. If there is an active lock, we can't perform the handoff
    even if the handoff bit is set at both the unlock and rwsem_wake()
    times. This is because there is a slight possibility that the original
    waiter that set the handoff bit may have bailed out, followed by a read
    lock, and then the handoff bit is set again by another waiter.

    It is also likely that the active lock in this case is a transient
    RWSEM_READER_BIAS that will be removed soon. So we have a secondary
    handoff done in the reader slow path to handle this particular case.

    For a reader-owned rwsem, the owner value other than the
    RWSEM_READER_OWNED bit is mostly for debugging purposes only. So it is
    not safe to use the owner value to confirm that a handoff to a reader
    has happened. On the other hand, we can do that when handing off to a
    writer. However, it is simpler to use the same mechanism to signal that
    a handoff has happened for both readers and writers. So a new
    HANDOFF_GRANTED state is added to enum rwsem_handoff_state to signify
    that. This new value will be written to the handoff_state field of the
    first waiter.

    With true lock handoff, there is no need to do NULL owner spinning
    anymore as a wakeup will be performed if the handoff is successful. So
    it is likely that the first waiter won't actually go to sleep even when
    schedule() is called in this case.
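
To make the recheck described above concrete, here is a minimal userspace
sketch in C11. It is not the patch code. All m_* names, the bit values, the
spinlock stand-in for wait_lock and the choice to clear the handoff bit on
a successful handoff are assumptions made for this model. It only shows
that the count passed down from unlock time is ignored, that the count is
reread under the wait_lock, and that the handoff is granted only when the
handoff bit is set and no lock is active.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified bit layout; these values are assumptions for this model. */
#define M_WRITER_LOCKED (1UL << 0)
#define M_FLAG_HANDOFF  (1UL << 2)
#define M_READER_BIAS   (1UL << 8)
#define M_READER_MASK   (~(M_READER_BIAS - 1))
#define M_LOCK_MASK     (M_WRITER_LOCKED | M_READER_MASK)

/* Hypothetical stand-in for enum rwsem_handoff_state. */
enum m_handoff_state { M_HANDOFF_NONE, M_HANDOFF_REQUESTED, M_HANDOFF_GRANTED };

struct m_waiter {
	bool is_writer;
	_Atomic int handoff_state;
};

struct m_rwsem {
	atomic_ulong count;
	atomic_flag wait_lock;      /* tiny spinlock standing in for wait_lock */
	struct m_waiter *first;     /* first waiter in the queue, or NULL */
};

static void m_rwsem_init(struct m_rwsem *sem)
{
	atomic_init(&sem->count, 0);
	atomic_flag_clear(&sem->wait_lock);
	sem->first = NULL;
}

/*
 * Called on the wakeup path. count_at_unlock may be stale, so it is
 * deliberately not used for the decision; the count is reread under
 * the wait_lock. Returns true if the lock was handed to the first waiter.
 */
static bool m_try_handoff(struct m_rwsem *sem, unsigned long count_at_unlock)
{
	bool granted = false;
	unsigned long count, newc;

	assert(sem != NULL);
	(void)count_at_unlock;

	while (atomic_flag_test_and_set(&sem->wait_lock))
		;	/* spin until the wait_lock is taken */

	count = atomic_load(&sem->count);
	if (sem->first && (count & M_FLAG_HANDOFF) && !(count & M_LOCK_MASK)) {
		/* Take the lock on behalf of the first waiter. */
		newc = (count & ~M_FLAG_HANDOFF) +
		       (sem->first->is_writer ? M_WRITER_LOCKED : M_READER_BIAS);
		/* Lock bits can still change under us, so use a cmpxchg. */
		if (atomic_compare_exchange_strong(&sem->count, &count, newc)) {
			atomic_store(&sem->first->handoff_state, M_HANDOFF_GRANTED);
			granted = true;
		}
	}

	atomic_flag_clear(&sem->wait_lock);
	return granted;
}
```

In this model a transient reader bias in the count makes m_try_handoff()
return false and leaves the waiter's state untouched, which corresponds to
the case that the secondary handoff in the reader slow path is meant to
cover.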

Please let me know what you think.

Cheers,
Longman