[PATCH RT v2] futex/rtmutex: Cure RT double blocking issue

From: Sebastian Siewior
Date: Thu May 11 2017 - 11:21:22 EST


RT has a problem when the wait on a futex/rtmutex got interrupted by a
timeout or a signal. task->pi_blocked_on is still set when returning from
rt_mutex_wait_proxy_lock(). The task must acquire the hash bucket lock
after this.

If the hash bucket lock is contended then the
BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
task_blocks_on_rt_mutex() will trigger.

This can be avoided by clearing task->pi_blocked_on in the return path of
rt_mutex_wait_proxy_lock() which removes the task from the boosting chain
of the rtmutex. That's correct because the task is no longer blocked on
it.

Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Reported-by: Engleder Gerhard <eg@xxxxxxxx>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
---
v1…v2: reset ->pi_blocked_on only in the error case.
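
The failure mode can be illustrated with a small userspace model. This is
only a sketch and not kernel code: the model_* types and helpers below are
hypothetical stand-ins, and the sketch assumes that a successful wait
already leaves pi_blocked_on cleared, so only the error path needs the
extra reset.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct model_waiter {
	int unused;
};

struct model_task {
	struct model_waiter *pi_blocked_on;	/* models task->pi_blocked_on */
};

/*
 * Models rt_mutex_wait_proxy_lock(). wait_result is 0 when the lock was
 * acquired and a negative errno when the wait was interrupted.
 */
static int model_wait_proxy_lock(struct model_task *tsk,
				 struct model_waiter *waiter,
				 int wait_result, bool apply_fix)
{
	/* The task is recorded as blocked while it waits. */
	tsk->pi_blocked_on = waiter;

	/* Assumption: acquiring the lock clears the pointer. */
	if (!wait_result)
		tsk->pi_blocked_on = NULL;

	/* The patched return path: reset the pointer in the error case. */
	if (wait_result && apply_fix)
		tsk->pi_blocked_on = NULL;

	return wait_result;
}

/*
 * Models the check in task_blocks_on_rt_mutex(): returns true where the
 * kernel would proceed and false where the BUG_ON() would trigger.
 */
static bool model_can_block_on_hb_lock(const struct model_task *tsk)
{
	return tsk->pi_blocked_on == NULL;
}

/* One interrupted wait followed by a contended hash bucket lock. */
static bool model_interrupted_wait_is_safe(bool apply_fix)
{
	struct model_waiter waiter = { 0 };
	struct model_task tsk = { .pi_blocked_on = NULL };
	int ret;

	ret = model_wait_proxy_lock(&tsk, &waiter, -ETIMEDOUT, apply_fix);

	return ret == -ETIMEDOUT && model_can_block_on_hb_lock(&tsk);
}
```

With apply_fix set, model_interrupted_wait_is_safe() reports that the task
may block on the hash bucket lock. Without it, the stale pointer is still
set, which is the state the BUG_ON() catches.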

kernel/locking/rtmutex.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 314fc65a35b1..4675f1197f33 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -2400,6 +2400,7 @@ int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
 			     struct hrtimer_sleeper *to,
 			     struct rt_mutex_waiter *waiter)
 {
+	struct task_struct *tsk = current;
 	int ret;

 	raw_spin_lock_irq(&lock->wait_lock);
@@ -2409,6 +2410,24 @@ int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
 	/* sleep on the mutex */
 	ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter, NULL);

+	/*
+	 * RT has a problem here when the wait got interrupted by a timeout
+	 * or a signal. task->pi_blocked_on is still set. The task must
+	 * acquire the hash bucket lock when returning from this function.
+	 *
+	 * If the hash bucket lock is contended then the
+	 * BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
+	 * task_blocks_on_rt_mutex() will trigger. This can be avoided by
+	 * clearing task->pi_blocked_on which removes the task from the
+	 * boosting chain of the rtmutex. That's correct because the task
+	 * is no longer blocked on it.
+	 */
+	if (ret) {
+		raw_spin_lock(&tsk->pi_lock);
+		tsk->pi_blocked_on = NULL;
+		raw_spin_unlock(&tsk->pi_lock);
+	}
+
 	raw_spin_unlock_irq(&lock->wait_lock);

 	return ret;
--
2.11.0