[PATCH] locking, rwsem: Fix down_write_killable()

From: Peter Zijlstra
Date: Thu May 12 2016 - 07:58:01 EST


On Wed, May 11, 2016 at 08:03:46PM +0200, Michal Hocko wrote:
> On Wed 11-05-16 15:59:38, Michal Hocko wrote:
> > On Wed 11-05-16 11:41:28, Peter Zijlstra wrote:
> > > On Wed, May 11, 2016 at 11:31:27AM +0200, Michal Hocko wrote:
> > >
> > > > Care to cook up a full patch?
> > >
> > > compile tested only, if someone could please test it?
> >
> > I have tried to run the test case from Tetsuo[1] with a small printk to
> > show the interrupted writer case:
> > [ 2753.596678] XXX: Writer interrupted. Woken waiters:0
> > [ 2998.266978] XXX: Writer interrupted. Woken waiters:0
> >
> > which means rwsem_atomic_update(-RWSEM_WAITING_BIAS, sem) path which is
> > the problematic case. oom_reaper was always able to succeed so I guess
> > the patch works as expected. I will leave the test run for longer to be
> > sure.
>
> And just for the reference I am able to reproduce the lockup without the
> patch applied and the same test case and a debugging patch
>
> [ 1522.036379] XXX interrupted. list_is_singular:1
> [ 1523.040462] oom_reaper: unable to reap pid:3736 (tgid=3736)

Here the complete patch. Ingo could you make it appear in the right tip
branch?


---
Subject: locking, rwsem: Fix down_write_killable()
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Wed, 11 May 2016 11:41:28 +0200

The new signal_pending exit path in __rwsem_down_write_failed_common()
was fingered as breaking his kernel by Tetsuo Handa.

Upon inspection it was found that there are two things wrong with it;

- it forgets to remove WAITING_BIAS if it leaves the list empty, or
- it forgets to wake further waiters that were blocked on the now
removed waiter.

Especially the first issue causes new lock attempts to block and stall
indefinitely, as the code assumes that pending waiters mean there is
an owner that will wake when it releases the lock.

Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
Cc: Waiman Long <Waiman.Long@xxxxxxx>
Cc: Chris Zankel <chris@xxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Max Filippov <jcmvbkbc@xxxxxxxxx>
Reported-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Tested-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Tested-by: Michal Hocko <mhocko@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
kernel/locking/rwsem-xadd.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)

--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -487,23 +487,32 @@ __rwsem_down_write_failed_common(struct

/* Block until there are no active lockers. */
do {
- if (signal_pending_state(state, current)) {
- raw_spin_lock_irq(&sem->wait_lock);
- ret = ERR_PTR(-EINTR);
- goto out;
- }
+ if (signal_pending_state(state, current))
+ goto out_nolock;
+
schedule();
set_current_state(state);
} while ((count = sem->count) & RWSEM_ACTIVE_MASK);

raw_spin_lock_irq(&sem->wait_lock);
}
-out:
__set_current_state(TASK_RUNNING);
list_del(&waiter.list);
raw_spin_unlock_irq(&sem->wait_lock);

return ret;
+
+out_nolock:
+ __set_current_state(TASK_RUNNING);
+ raw_spin_lock_irq(&sem->wait_lock);
+ list_del(&waiter.list);
+ if (list_empty(&sem->wait_list))
+ rwsem_atomic_update(-RWSEM_WAITING_BIAS, sem);
+ else
+ __rwsem_do_wake(sem, RWSEM_WAKE_ANY);
+ raw_spin_unlock_irq(&sem->wait_lock);
+
+ return ERR_PTR(-EINTR);
}

__visible struct rw_semaphore * __sched