Re: [PATCH v3] locking/rtmutex: Limit # of lock stealing for non-RT waiters

From: Waiman Long
Date: Mon Jul 11 2022 - 16:02:31 EST


On 7/11/22 05:34, Peter Zijlstra wrote:
> On Wed, Jul 06, 2022 at 09:59:16AM -0400, Waiman Long wrote:
>> Commit 48eb3f4fcfd3 ("locking/rtmutex: Implement equal priority lock
>> stealing") allows an unlimited number of lock steals for non-RT
>> tasks. That can lead to lock starvation of non-RT top waiter tasks if
>> there is a constant incoming stream of non-RT lockers. This can cause
>> an rcu_preempt self-detected stall or even a task lockup in a
>> PREEMPT_RT kernel. For example,
>>
>> [77107.424943] rcu: INFO: rcu_preempt self-detected stall on CPU
>> [ 1249.921363] INFO: task systemd:2178 blocked for more than 622 seconds.
>>
>> Avoid this problem and ensure forward progress by limiting the
>> number of times that a lock can be stolen from each waiter. This patch
>> sets a threshold of 32. That number is arbitrary and can be changed
>> if needed.
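
As a rough illustration of the cap described above, here is a small
user-space sketch in C. It is not the patch's code: struct toy_waiter,
toy_may_steal() and MAX_STEALS are hypothetical names made up for this
example, the priority handling is simplified, and only the value 32
comes from the description.

```c
#include <stdbool.h>

/* Hypothetical cap; the description uses 32 and calls it arbitrary. */
#define MAX_STEALS 32

/* Toy stand-in for a waiter; not the kernel's rtmutex waiter struct. */
struct toy_waiter {
	int prio;            /* lower value means higher priority */
	unsigned int stolen; /* times the lock was stolen from this waiter */
};

/*
 * Decide whether a newly arriving non-RT locker may take the lock
 * ahead of the current top waiter. Equal priority is allowed to
 * steal, but only until the top waiter has been bypassed MAX_STEALS
 * times. After that the locker has to queue behind the top waiter.
 */
static bool toy_may_steal(struct toy_waiter *top, int locker_prio)
{
	if (locker_prio > top->prio)    /* lower priority never steals */
		return false;
	if (top->stolen >= MAX_STEALS)  /* cap reached, stop stealing */
		return false;
	top->stolen++;
	return true;
}
```

With a constant stream of equal-priority lockers, only the first 32
calls to toy_may_steal() succeed for a given top waiter, so that waiter
is guaranteed to make progress afterwards.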

> Why not do the same thing we do for regular mutexes?

The mutex way is a possible alternative. We could set a flag that
disables lock stealing once the current top waiter wakes up and finds
that the rtmutex has been stolen. I will need to run some tests to find
out how many times lock stealing can happen before it is blocked. I
would like to allow a sufficient number of lock steals to minimize the
performance impact of this change.
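
To make the flag idea concrete, here is a minimal user-space sketch in
C. It is only an assumption about how such a scheme could look, not
rtmutex or mutex code: struct toy_lock, its no_steal flag and the
toy_*() helpers are hypothetical, and there is no real locking or
concurrency in it.

```c
#include <stdbool.h>

/* Toy lock state; all fields are made up for this sketch. */
struct toy_lock {
	int owner;           /* 0 = free, otherwise id of the holder */
	bool no_steal;       /* set once the top waiter woke up and lost */
	unsigned int steals; /* steals that happened before blocking */
};

/* A new locker (not the top waiter) tries to grab a free lock. */
static bool toy_try_steal(struct toy_lock *l, int id)
{
	if (l->owner != 0 || l->no_steal)
		return false;
	l->owner = id;
	l->steals++;
	return true;
}

/* The top waiter wakes up and tries to take the lock. */
static bool toy_top_waiter_wakeup(struct toy_lock *l, int id)
{
	if (l->owner == 0) {
		l->owner = id;
		l->no_steal = false; /* stealing is allowed again */
		return true;
	}
	/* The lock was stolen in the meantime: block further stealing. */
	l->no_steal = true;
	return false;
}

/* Release the lock; the flag stays set until the top waiter gets it. */
static void toy_unlock(struct toy_lock *l)
{
	l->owner = 0;
}
```

In this sketch the steals counter records how many steals went through
before the flag was set, which is the number I would want to measure in
the tests mentioned above.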

Cheers,
Longman