Re: [PATCH v5] locking/rwsem: Make handoff bit handling more consistent

From: Waiman Long
Date: Tue Apr 12 2022 - 13:04:20 EST


On 4/12/22 12:28, john.p.donnelly@xxxxxxxxxx wrote:
On 4/11/22 4:07 PM, Waiman Long wrote:

On 4/11/22 17:03, john.p.donnelly@xxxxxxxxxx wrote:


I have reached out to Waiman and he suggested this for our next test pass:


1ee326196c6658 locking/rwsem: Always try to wake waiters in out_nolock path

Does this commit help to avoid the lockup problem?

Commit 1ee326196c6658 fixes a potential missed wakeup problem when a reader first in the wait queue is interrupted out without acquiring the lock. It is actually not a fix for commit d257cc8cb8d5. However, commit d257cc8cb8d5 changes the out_nolock path behavior of writers by leaving the handoff bit set when the wait queue isn't empty. That likely makes the missed wakeup problem easier to reproduce.
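
The interaction described above can be sketched with a small toy model. This is only an illustration under stated assumptions, not the kernel implementation: the three bit names follow kernel/locking/rwsem.c, but struct toy_rwsem, the toy_* helpers, and the wake_others switch are hypothetical, and the model has no atomics, no wait_lock, and no real wait queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit names mirror kernel/locking/rwsem.c; everything else is a toy model. */
#define RWSEM_WRITER_LOCKED (1UL << 0)
#define RWSEM_FLAG_WAITERS  (1UL << 1)
#define RWSEM_FLAG_HANDOFF  (1UL << 2)

/* Hypothetical stand-in for struct rw_semaphore. */
struct toy_rwsem {
    unsigned long count;  /* owner and flag bits */
    int nr_waiters;       /* length of the wait queue */
    int nr_woken;         /* wakeups issued so far */
};

/*
 * Remove one waiter. WAITERS and HANDOFF are cleared only when the
 * queue becomes empty, so HANDOFF can stay set after a waiter leaves.
 * Returns true if other waiters remain queued.
 */
static bool toy_del_waiter(struct toy_rwsem *sem)
{
    assert(sem->nr_waiters > 0);
    if (--sem->nr_waiters == 0) {
        sem->count &= ~(RWSEM_FLAG_WAITERS | RWSEM_FLAG_HANDOFF);
        return false;
    }
    return true;
}

/*
 * Toy out_nolock path for the first waiter giving up without the lock.
 * wake_others models the "always try to wake waiters" behaviour.
 */
static void toy_out_nolock(struct toy_rwsem *sem, bool wake_others)
{
    bool more = toy_del_waiter(sem);

    if (more && wake_others && !(sem->count & RWSEM_WRITER_LOCKED))
        sem->nr_woken++;  /* pass the wakeup on to the new first waiter */
}

/* True when the lock is free, waiters are queued, and nobody was woken. */
static bool toy_missed_wakeup(const struct toy_rwsem *sem)
{
    return !(sem->count & RWSEM_WRITER_LOCKED) &&
           sem->nr_waiters > 0 && sem->nr_woken == 0;
}
```

In this model, calling toy_out_nolock() with wake_others set to false on a semaphore with two waiters leaves one waiter queued behind a set handoff bit with nobody woken, which is the stuck state; with wake_others set to true the remaining waiter gets a wakeup.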

Cheers,
Longman


Hi,


We are testing now.

The ETA for fio soak test completion is ~15hr from now.

I wanted to share the stack traces for future reference + occurrences.

I am looking forward to your testing results tomorrow.

Cheers,
Longman

Hi

Our 24hr fio soak test with:

 1ee326196c6658 locking/rwsem: Always try to wake waiters in out_nolock path

applied to 5.15.30 passed.

I suggest you append 1ee326196c6658 with:


 cc: stable

  Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")


I'll leave the details of how to do that up to the core maintainers ;-)

Thanks for the test.

The patch is already in the tip tree, so it may not be easy to add a Fixes tag to it. Anyway, I will encourage the stable tree maintainers to take it, as it does fix a problem, as shown in your test.

Cheers,
Longman