[tip:locking/core] locking/pvstat: Separate wait_again and spurious wakeup stats

From: tip-bot for Waiman Long
Date: Wed Aug 10 2016 - 14:07:50 EST


Commit-ID: 08be8f63c40c030b5cf95b4368e314e563a86301
Gitweb: http://git.kernel.org/tip/08be8f63c40c030b5cf95b4368e314e563a86301
Author: Waiman Long <Waiman.Long@xxxxxxx>
AuthorDate: Tue, 31 May 2016 12:53:47 -0400
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Wed, 10 Aug 2016 14:16:02 +0200

locking/pvstat: Separate wait_again and spurious wakeup stats

Currently there are overlap in the pvqspinlock wait_again and
spurious_wakeup stat counters. Because of lock stealing, it is
no longer possible to accurately determine if spurious wakeup has
happened in the queue head. As they track both the queue node and
queue head status, it is also hard to tell how many of those comes
from the queue head and how many from the queue node.

This patch changes the accounting rules so that spurious wakeup is
only tracked in the queue node. The wait_again count, however, is
only tracked in the queue head when the vCPU failed to acquire the
lock after a vCPU kick. This should give a much better indication of
the wait-kick dynamics in the queue node and the queue head.

Signed-off-by: Waiman Long <Waiman.Long@xxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
Cc: Douglas Hatch <doug.hatch@xxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Pan Xinhui <xinhui@xxxxxxxxxxxxxxxxxx>
Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Scott J Norton <scott.norton@xxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/1464713631-1066-2-git-send-email-Waiman.Long@xxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/locking/qspinlock_paravirt.h | 12 +++---------
kernel/locking/qspinlock_stat.h | 4 ++--
2 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index 429c3dc..3acf16d 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -288,12 +288,10 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
{
struct pv_node *pn = (struct pv_node *)node;
struct pv_node *pp = (struct pv_node *)prev;
- int waitcnt = 0;
int loop;
bool wait_early;

- /* waitcnt processing will be compiled out if !QUEUED_LOCK_STAT */
- for (;; waitcnt++) {
+ for (;;) {
for (wait_early = false, loop = SPIN_THRESHOLD; loop; loop--) {
if (READ_ONCE(node->locked))
return;
@@ -317,7 +315,6 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)

if (!READ_ONCE(node->locked)) {
qstat_inc(qstat_pv_wait_node, true);
- qstat_inc(qstat_pv_wait_again, waitcnt);
qstat_inc(qstat_pv_wait_early, wait_early);
pv_wait(&pn->state, vcpu_halted);
}
@@ -458,12 +455,9 @@ pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node)
pv_wait(&l->locked, _Q_SLOW_VAL);

/*
- * The unlocker should have freed the lock before kicking the
- * CPU. So if the lock is still not free, it is a spurious
- * wakeup or another vCPU has stolen the lock. The current
- * vCPU should spin again.
+ * Because of lock stealing, the queue head vCPU may not be
+ * able to acquire the lock before it has to wait again.
*/
- qstat_inc(qstat_pv_spurious_wakeup, READ_ONCE(l->locked));
}

/*
diff --git a/kernel/locking/qspinlock_stat.h b/kernel/locking/qspinlock_stat.h
index b9d0315..eb0a599 100644
--- a/kernel/locking/qspinlock_stat.h
+++ b/kernel/locking/qspinlock_stat.h
@@ -24,8 +24,8 @@
* pv_latency_wake - average latency (ns) from vCPU kick to wakeup
* pv_lock_slowpath - # of locking operations via the slowpath
* pv_lock_stealing - # of lock stealing operations
- * pv_spurious_wakeup - # of spurious wakeups
- * pv_wait_again - # of vCPU wait's that happened after a vCPU kick
+ * pv_spurious_wakeup - # of spurious wakeups in non-head vCPUs
+ * pv_wait_again - # of wait's after a queue head vCPU kick
* pv_wait_early - # of early vCPU wait's
* pv_wait_head - # of vCPU wait's at the queue head
* pv_wait_node - # of vCPU wait's at a non-head queue node