Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes

From: Peter Zijlstra
Date: Wed Nov 23 2016 - 10:19:33 EST


On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
> @@ -473,7 +476,14 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
>  	 */
>  	mutex_clear_owner(&lock->base);
>  #endif
> -	__mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
> +	/*
> +	 * A previously _not_ waiting task may acquire the lock via the fast
> +	 * path during our unlock. In that case, already waiting tasks may have
> +	 * to back off to avoid a deadlock. Wake up all waiters so that they
> +	 * can check their acquire context stamp against the new owner.
> +	 */
> +	__mutex_fastpath_unlock(&lock->base.count,
> +				__mutex_unlock_slowpath_wakeall);
>  }

So doing a wake-all has obvious issues with thundering herd etc. Also,
with the new mutex, you'd not be able to do hand-off, which would
introduce starvation cases.

Ideally we'd iterate the blocked list and pick the waiter with the
earliest stamp, or we'd maintain the list in stamp order instead of
FIFO, for ww_mutex.
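
To make the two options above concrete, here is a rough userspace sketch.
It is not the mutex code: struct waiter, stamp_before(), pick_earliest()
and insert_by_stamp() are made-up names for illustration, and it assumes
that a lower stamp means an older acquire context and that any two live
stamps are less than LONG_MAX apart, so a signed difference orders them
across wraparound.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a ww_mutex waiter; not the kernel's struct. */
struct waiter {
	unsigned long stamp;	/* acquire-context stamp; lower means older */
	struct waiter *next;	/* next entry on the blocked list */
};

/* Wraparound-safe "a is older than b" (assumes stamps < LONG_MAX apart). */
static int stamp_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Option 1: keep the list FIFO and scan for the earliest stamp at unlock. */
static struct waiter *pick_earliest(struct waiter *head)
{
	struct waiter *best = head;
	struct waiter *w;

	for (w = head; w; w = w->next)
		if (stamp_before(w->stamp, best->stamp))
			best = w;

	return best;	/* NULL when the list is empty */
}

/* Option 2: insert in stamp order so the head is always the oldest waiter. */
static void insert_by_stamp(struct waiter **head, struct waiter *w)
{
	struct waiter **pos = head;

	assert(head && w);

	/* Skip every waiter that is not younger than w; ties stay FIFO. */
	while (*pos && !stamp_before(w->stamp, (*pos)->stamp))
		pos = &(*pos)->next;

	w->next = *pos;
	*pos = w;
}
```

Option 1 pays O(n) on every unlock, option 2 pays O(n) when a waiter
blocks but makes the wakeup target simply the list head. Either way only
one task gets woken, which avoids the herd.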