[PATCH] locking/mutex: Avoid spinner vs waiter starvation

From: Peter Zijlstra
Date: Mon Feb 01 2016 - 05:08:44 EST


On Sat, Jan 30, 2016 at 09:18:44AM +0800, Ding Tianhong wrote:
> On 2016/1/29 17:53, Peter Zijlstra wrote:
> > On Sun, Jan 24, 2016 at 04:03:50PM +0800, Ding Tianhong wrote:
> >
> >> looks good to me, I will try this solution and report the result, thanks everyone.
> >
> > Did you get a change to run with this?
> >
> > .
> >
>
> I backport this patch to 3.10 lts kernel, and didn't change any logic,
> Till now, the patch works fine to me, and no need to change anything,
> So I think this patch is no problem, could you formal release this
> patch to the latest kernel? :)

Thanks for testing, I've queued the below patch.

---
Subject: locking/mutex: Avoid spinner vs waiter starvation
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Fri, 22 Jan 2016 12:06:53 +0100

Ding Tianhong reported that under his load the optimistic spinners
would totally starve a task that ended up on the wait list.

Fix this by ensuring the top waiter also partakes in the optimistic
spin queue.

There are a few subtle differences between the assumed state of
regular optimistic spinners and those already on the wait list, which
result in the @acquired complication of the acquire path.

Most notable are:

- waiters are on the wait list and need to be taken off
- mutex_optimistic_spin() sets the lock->count to 0 on acquire
even though there might be more tasks on the wait list.

Cc: Jason Low <jason.low2@xxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Waiman Long <waiman.long@xxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
Cc: Will Deacon <Will.Deacon@xxxxxxx>
Reported-by: Ding Tianhong <dingtianhong@xxxxxxxxxx>
Tested-by: Ding Tianhong <dingtianhong@xxxxxxxxxx>
Tested-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
Suggested-by: Waiman Long <Waiman.Long@xxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/20160122110653.GF6375@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
---
kernel/locking/mutex.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -512,6 +512,7 @@ __mutex_lock_common(struct mutex *lock,
struct task_struct *task = current;
struct mutex_waiter waiter;
unsigned long flags;
+ bool acquired;
int ret;

preempt_disable();
@@ -543,6 +544,7 @@ __mutex_lock_common(struct mutex *lock,
lock_contended(&lock->dep_map, ip);

for (;;) {
+ acquired = false;
/*
* Lets try to take the lock again - this is needed even if
* we get here for the first time (shortly after failing to
@@ -577,7 +579,16 @@ __mutex_lock_common(struct mutex *lock,
/* didn't get the lock, go to sleep: */
spin_unlock_mutex(&lock->wait_lock, flags);
schedule_preempt_disabled();
+
+ if (mutex_is_locked(lock))
+ acquired = mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx);
+
spin_lock_mutex(&lock->wait_lock, flags);
+
+ if (acquired) {
+ atomic_set(&lock->count, -1);
+ break;
+ }
}
__set_task_state(task, TASK_RUNNING);

@@ -587,6 +598,9 @@ __mutex_lock_common(struct mutex *lock,
atomic_set(&lock->count, 0);
debug_mutex_free_waiter(&waiter);

+ if (acquired)
+ goto unlock;
+
skip_wait:
/* got the lock - cleanup and rejoice! */
lock_acquired(&lock->dep_map, ip);
@@ -597,6 +611,7 @@ __mutex_lock_common(struct mutex *lock,
ww_mutex_set_context_slowpath(ww, ww_ctx);
}

+unlock:
spin_unlock_mutex(&lock->wait_lock, flags);
preempt_enable();
return 0;