[PATCH] sched/fair: Fix the nohz.next_balance update mess

From: Wanpeng Li
Date: Sun Feb 05 2017 - 05:04:27 EST


From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>

The commit:
c5afb6a87f2 ("sched/fair: Fix nohz.next_balance update")

intends to update nohz.next_balance in two steps.

1) The ILB CPU utilizes next_balance variable in nohz_idle_balance()
to gather the shortest next balance of other idle CPUs before
updating nohz.next_balance.
2) The ILB CPU updates the nohz.next_balance according to its own
next_balance after load balance on behalf of other idle CPUs.

However, there is a mess which breaks the original intention of the
first step, every idle CPUs update nohz.next_balance during ILB CPU
on behalf of them to do load balance, and then the ILB CPU utilizes
next_balance variable in nohz_idle_balance() to gather the shortest
next balance of other idle CPUs before updating nohz.next_balance.

This patch fixes it by don't update nohz.next_balance for other idle
CPUs when ILB CPU on behalf of them to do load balance.

Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
CC: Ingo Molnar <mingo@xxxxxxxxxx>
Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
---
kernel/sched/fair.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 274c747..83948a4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8750,7 +8750,8 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
* balance for itself and we need to update the
* nohz.next_balance accordingly.
*/
- if ((idle == CPU_IDLE) && time_after(nohz.next_balance, rq->next_balance))
+ if ((idle == CPU_IDLE) && time_after(nohz.next_balance, rq->next_balance) &&
+ !test_bit(NOHZ_BALANCE_KICK, nohz_flags(this_rq()->cpu)))
nohz.next_balance = rq->next_balance;
#endif
}
--
2.7.4