Re: locking/rwsem: RT throttling issue due to RT task hogging the cpu

From: Mukesh Ojha
Date: Mon Sep 26 2022 - 10:12:31 EST


Hi,

Any comments on this issue would be helpful.

Thanks,
Mukesh

On 9/20/2022 9:49 PM, Mukesh Ojha wrote:
Hi,

We are observing an issue where sem->owner is not set and sem->count = 6 [1], which means that both the RWSEM_FLAG_WAITERS and RWSEM_FLAG_HANDOFF bits are set. Unfolding sem->wait_list shows the waiting processes in the order given in [2]: [a] is waiting for write, [b] and [c] are waiting for read, and [d] is the RT task for which waiter.handoff_set = true. [d] is running continuously on cpu7 and does not let the first write waiter [a] run on cpu7.

[1]

  sem = 0xFFFFFFD57DDC6680 -> (
    count = (counter = 6),
    owner = (counter = 0),
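
For reference, the count value in [1] can be decoded with a small userspace sketch. The bit definitions below mirror the ones in kernel/locking/rwsem.c; the struct and the helper rwsem_decode_count() are hypothetical names used only for illustration, and the RWSEM_FLAG_READFAIL bit at the top of the word is ignored for simplicity.

```c
#include <stdbool.h>

/* Bit layout of sem->count, mirroring kernel/locking/rwsem.c. */
#define RWSEM_WRITER_LOCKED  (1UL << 0)
#define RWSEM_FLAG_WAITERS   (1UL << 1)
#define RWSEM_FLAG_HANDOFF   (1UL << 2)
#define RWSEM_READER_SHIFT   8

/* Hypothetical helper type, for illustration only (not in the kernel). */
struct rwsem_count_fields {
        bool writer_locked;     /* a writer holds the lock */
        bool waiters;           /* wait_list is not empty */
        bool handoff;           /* first waiter has requested handoff */
        unsigned long readers;  /* reader count (READFAIL bit ignored) */
};

/* Split a raw count value into its flag bits and reader count. */
static struct rwsem_count_fields rwsem_decode_count(unsigned long count)
{
        struct rwsem_count_fields f = {
                .writer_locked = (count & RWSEM_WRITER_LOCKED) != 0,
                .waiters       = (count & RWSEM_FLAG_WAITERS) != 0,
                .handoff       = (count & RWSEM_FLAG_HANDOFF) != 0,
                .readers       = count >> RWSEM_READER_SHIFT,
        };

        return f;
}
```

With count = 6 this gives waiters and handoff set, writer_locked clear and a reader count of 0, which matches the state described above.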

[2]

[a] kworker/7:0 pid: 32516 ==> [b] iptables-restor pid: 18625 ==> [c] HwBinder:1544_3 pid: 2024 ==> [d] RenderEngine pid: 2032 cpu: 7 prio: 97 (RT task)


Some time back, Waiman suggested the following patch, which could help
the RT task leave the cpu.

https://lore.kernel.org/all/8c33f989-8870-08c6-db12-521de634b34e@xxxxxxxxxx/

--------------------------------->O----------------------------

From c6493edd7a5e4f597ea55ff0eb3f1d763b335dfc Mon Sep 17 00:00:00 2001
From: Waiman Long <longman@xxxxxxxxxx>
Date: Tue, 20 Sep 2022 20:50:45 +0530
Subject: [PATCH] locking/rwsem: Yield the cpu after doing handoff optimistic
 spinning

It is possible that the new lock owner (writer) is preempted before setting
the owner field. If the current waiter (e.g. an RT task) is the task that
preempted the new lock owner, it will spin in the handoff loop for a long
time. Avoid wasting cpu time and delaying the release of the lock by
yielding the cpu if handoff optimistic spinning has been done multiple
times with a NULL owner.

Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
Signed-off-by: Mukesh Ojha <quic_mojha@xxxxxxxxxxx>
---
 kernel/locking/rwsem.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 65f0262..a875758 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -361,6 +361,8 @@ enum rwsem_wake_type {
  */
 #define MAX_READERS_WAKEUP     0x100

+#define MAX_HANDOFF_SPIN       10
+
 static inline void
 rwsem_add_waiter(struct rw_semaphore *sem, struct rwsem_waiter *waiter)
 {
@@ -1109,6 +1111,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 {
        struct rwsem_waiter waiter;
        DEFINE_WAKE_Q(wake_q);
+       int handoff_spins = 0;

        /* do optimistic spinning and steal lock if possible */
        if (rwsem_can_spin_on_owner(sem) && rwsem_optimistic_spin(sem)) {
@@ -1167,6 +1170,14 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
                 * has just released the lock, OWNER_NULL will be returned.
                 * In this case, we attempt to acquire the lock again
                 * without sleeping.
+                *
+                * It is possible the new lock owner (writer) can be preempted
+                * before setting the owner field and if the current(e.g RT task)
+                * waiter is the task that preempts the new lock owner, it will
+                * spin in this loop for a long time. Avoid wasting cpu time
+                * and delaying the release of the lock by yielding the cpu if
+                * handoff optimistic spinning has been done multiple times with
+                * NULL owner.
                 */
                if (waiter.handoff_set) {
                        enum owner_state owner_state;
@@ -1175,8 +1186,10 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
                        owner_state = rwsem_spin_on_owner(sem);
                        preempt_enable();

-                       if (owner_state == OWNER_NULL)
+                       if ((owner_state == OWNER_NULL) && (handoff_spins < MAX_HANDOFF_SPIN)) {
+                               handoff_spins++;
                                goto trylock_again;
+                       }
                }

                schedule();
--
2.7.4

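
The control flow of the patch above (retry a bounded number of times, then give up the cpu) can be sketched in userspace roughly as follows. This is only an analogy under stated assumptions: struct toy_lock, toy_trylock() and toy_lock_bounded_spin() are hypothetical names, the lock is a plain C11 atomic flag rather than an rwsem, and sched_yield() stands in for the kernel's schedule(), which has different semantics (in particular for RT tasks). Unlike the patch, the spin budget here is reset after every yield.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_HANDOFF_SPIN 10     /* same bound as in the patch */

/* Hypothetical userspace stand-in for a lock word; not the kernel rwsem. */
struct toy_lock {
        atomic_bool locked;
};

/* Single acquisition attempt: succeeds only if the lock was free. */
static bool toy_trylock(struct toy_lock *l)
{
        bool expected = false;

        return atomic_compare_exchange_strong(&l->locked, &expected, true);
}

/*
 * Make at most MAX_HANDOFF_SPIN back-to-back attempts, then yield the
 * cpu before trying again. Returns the number of yields that were needed
 * to take the lock, or -1 if it is still held after max_yields yields.
 */
static int toy_lock_bounded_spin(struct toy_lock *l, int max_yields)
{
        int yields = 0;

        for (;;) {
                for (int spins = 0; spins < MAX_HANDOFF_SPIN; spins++) {
                        if (toy_trylock(l))
                                return yields;
                }

                if (yields == max_yields)
                        return -1;

                /* Spin budget used up: let another task run. */
                sched_yield();
                yields++;
        }
}
```

The point of the bound is the same as in the patch: a waiter that keeps failing to take the lock stops monopolizing the cpu, so the task that actually has to make progress can be scheduled.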


-Mukesh