Re: [PATCH v3] locking/rtmutex: Limit # of lock stealing for non-RT waiters

From: Waiman Long
Date: Wed Jul 06 2022 - 10:03:34 EST


On 7/6/22 09:59, Waiman Long wrote:
Commit 48eb3f4fcfd3 ("locking/rtmutex: Implement equal priority lock
stealing") allows an unlimited number of lock steals by non-RT
tasks. That can lead to lock starvation of non-RT top waiter tasks if
there is a constant incoming stream of non-RT lockers. This can cause
an rcu_preempt self-detected stall or even a task lockup in a
PREEMPT_RT kernel. For example,

[77107.424943] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 1249.921363] INFO: task systemd:2178 blocked for more than 622 seconds.

Avoid this problem and ensure forward progress by limiting the
number of times that a lock can be stolen from each waiter. This patch
sets a threshold of 32. That number is arbitrary and can be changed
if needed.

Fixes: 48eb3f4fcfd3 ("locking/rtmutex: Implement equal priority lock stealing")
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
---
kernel/locking/rtmutex.c | 9 ++++++---
kernel/locking/rtmutex_common.h | 8 ++++++++
2 files changed, 14 insertions(+), 3 deletions(-)

[v3: Increase threshold to 32 and mention rcu_preempt self-detected stalls]

Note that I decided to increase the threshold from 10 to 32 to reduce the potential performance impact of this change, if any. We also found that this patch can fix some of the rcu_preempt self-detected stall problems that we saw with the PREEMPT_RT kernel, so I added that information to the patch description.
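
For readers who have not seen the diff, the limiting idea in the patch description can be sketched in userspace roughly as follows. This is only an illustration, not the code in the patch: MAX_STEALS, struct waiter, nr_steals, may_steal() and simulate() are hypothetical names chosen for the sketch, and only the value 32 comes from the description above.

```c
#include <stdbool.h>

/* Hypothetical name; the patch description only states the value 32. */
#define MAX_STEALS 32

/* Hypothetical stand-in for a top waiter, not the kernel's waiter struct. */
struct waiter {
	unsigned int nr_steals;	/* times the lock was stolen from this waiter */
};

/*
 * Decide whether an incoming non-RT locker may steal the lock from an
 * equal-priority top waiter. Once the per-waiter budget is used up,
 * stealing is refused so that the top waiter can make forward progress.
 */
static bool may_steal(struct waiter *top)
{
	if (top->nr_steals >= MAX_STEALS)
		return false;
	top->nr_steals++;
	return true;
}

/* Model a constant stream of incoming lockers; return the number of steals. */
static unsigned int simulate(struct waiter *top, unsigned int incoming)
{
	unsigned int stolen = 0;

	for (unsigned int i = 0; i < incoming; i++)
		if (may_steal(top))
			stolen++;
	return stolen;
}
```

With a zero-initialized waiter, simulate() over a stream of 100 lockers stops granting steals once the count reaches MAX_STEALS, which is the bounded-starvation behavior the description is after. A smaller threshold bounds the top waiter's wait more tightly but gives up more of the throughput benefit of stealing.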

Cheers,
Longman