Re: [RFC][PATCH 4/4] futex: Rewrite FUTEX_UNLOCK_PI

From: Peter Zijlstra
Date: Mon Oct 03 2016 - 11:45:17 EST


On Mon, Oct 03, 2016 at 11:36:24AM -0400, Steven Rostedt wrote:
> > /*
> > - * If current does not own the pi_state then the futex is
> > - * inconsistent and user space fiddled with the futex value.
> > + * Now that we hold wait_lock, no new waiters can happen on the
> > + * rt_mutex and new owner is stable. Drop hb->lock.
> > */
> > - if (pi_state->owner != current)
> > - return -EINVAL;
> > + spin_unlock(&hb->lock);
> >
>
> Also, as Sebastian has said before, I believe this breaks rt's migrate
> disable code. migrate_disable() and migrate_enable() are a nop if
> preemption is disabled, so if you hold a raw_spin_lock across a
> spin_unlock(), the migrate_enable() will be a nop and the
> migrate_disable() will never be undone.

It's been too long since I looked at that trainwreck, but yuck, that
would make lock/unlock order important :-(
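
To make the imbalance concrete, here is a toy userspace model of the
accounting described above. It is only a sketch and not kernel code: the
struct and helper names are made up for illustration, and it assumes that
on rt a spin_lock() does a migrate_disable(), a spin_unlock() does a
migrate_enable(), and both are a nop while preemption is disabled.

```c
#include <assert.h>

/* Hypothetical per-task state; not the real task_struct. */
struct toy_task {
	int preempt_count;	/* > 0 means preemption is disabled */
	int migrate_disable;	/* > 0 means the task is pinned to its CPU */
};

/* Assumed rt behaviour: both helpers are a nop when preemption is off. */
static void toy_migrate_disable(struct toy_task *t)
{
	if (t->preempt_count)
		return;
	t->migrate_disable++;
}

static void toy_migrate_enable(struct toy_task *t)
{
	if (t->preempt_count)
		return;
	t->migrate_disable--;
}

/* Sleeping spinlock on rt (stands in for hb->lock): only the accounting. */
static void toy_spin_lock(struct toy_task *t)   { toy_migrate_disable(t); }
static void toy_spin_unlock(struct toy_task *t) { toy_migrate_enable(t); }

/* Raw spinlock (stands in for wait_lock): disables preemption. */
static void toy_raw_spin_lock(struct toy_task *t)
{
	t->preempt_count++;
}

static void toy_raw_spin_unlock(struct toy_task *t)
{
	assert(t->preempt_count > 0);
	t->preempt_count--;
}

/* Order from the RFC: take wait_lock, then drop hb->lock under it. */
int leaked_old_order(void)
{
	struct toy_task t = { 0, 0 };

	toy_spin_lock(&t);		/* migrate_disable: 0 -> 1 */
	toy_raw_spin_lock(&t);		/* preemption off */
	toy_spin_unlock(&t);		/* migrate_enable is a nop here */
	toy_raw_spin_unlock(&t);	/* preemption back on */
	return t.migrate_disable;	/* count left behind */
}

/* Reordered: drop hb->lock first, then take wait_lock. */
int leaked_new_order(void)
{
	struct toy_task t = { 0, 0 };

	toy_spin_lock(&t);		/* migrate_disable: 0 -> 1 */
	toy_spin_unlock(&t);		/* migrate_enable: 1 -> 0 */
	toy_raw_spin_lock(&t);
	toy_raw_spin_unlock(&t);
	return t.migrate_disable;
}
```

In this model leaked_old_order() returns 1 (the task stays
migrate-disabled after both locks are released) and leaked_new_order()
returns 0, which is why the unlock has to happen before the raw lock is
taken.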

Now I think we could do something like so, but I'm not entirely sure of
the various lifetime rules here; it's not overly documented.

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1300,15 +1300,14 @@ static int wake_futex_pi(u32 __user *uad
 	WAKE_Q(wake_q);
 	int ret = 0;

-	raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
-
 	WARN_ON_ONCE(!atomic_inc_not_zero(&pi_state->refcount));
 	/*
-	 * Now that we hold wait_lock, no new waiters can happen on the
-	 * rt_mutex and new owner is stable. Drop hb->lock.
+	 * XXX
 	 */
 	spin_unlock(&hb->lock);

+	raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
+
 	new_owner = rt_mutex_next_owner(&pi_state->pi_mutex);

 	/*