[tip: sched/urgent] sched: Don't run cpu-online with balance_push() enabled

From: tip-bot2 for Peter Zijlstra
Date: Fri Jan 22 2021 - 12:57:13 EST


The following commit has been merged into the sched/urgent branch of tip:

Commit-ID: 22f667c97aadbf481e2cae2d6feabdf431e27b31
Gitweb: https://git.kernel.org/tip/22f667c97aadbf481e2cae2d6feabdf431e27b31
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
AuthorDate: Fri, 15 Jan 2021 18:17:45 +01:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Fri, 22 Jan 2021 15:09:42 +01:00

sched: Don't run cpu-online with balance_push() enabled

We don't need to push away tasks when a CPU comes online; mark the push
as complete right before the CPU dies.

XXX: the hotplug state machine has trouble with rollback here.

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>
Tested-by: Valentin Schneider <valentin.schneider@xxxxxxx>
Link: https://lkml.kernel.org/r/20210121103506.415606087@xxxxxxxxxxxxx
---
 kernel/sched/core.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 627534f..8da0fd7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7320,10 +7320,12 @@ static void balance_push_set(int cpu, bool on)
 	struct rq_flags rf;

 	rq_lock_irqsave(rq, &rf);
-	if (on)
+	if (on) {
+		WARN_ON_ONCE(rq->balance_callback);
 		rq->balance_callback = &balance_push_callback;
-	else
+	} else if (rq->balance_callback == &balance_push_callback) {
 		rq->balance_callback = NULL;
+	}
 	rq_unlock_irqrestore(rq, &rf);
 }

@@ -7441,6 +7443,10 @@ int sched_cpu_activate(unsigned int cpu)
 	struct rq *rq = cpu_rq(cpu);
 	struct rq_flags rf;

+	/*
+	 * Make sure that when the hotplug state machine does a roll-back
+	 * we clear balance_push. Ideally that would happen earlier...
+	 */
 	balance_push_set(cpu, false);

 #ifdef CONFIG_SCHED_SMT
@@ -7608,6 +7614,12 @@ int sched_cpu_dying(unsigned int cpu)
 	}
 	rq_unlock_irqrestore(rq, &rf);

+	/*
+	 * Now that the CPU is offline, make sure we're welcome
+	 * to new tasks once we come back up.
+	 */
+	balance_push_set(cpu, false);
+
 	calc_load_migrate(rq);
 	update_max_interval();
 	nohz_balance_exit_idle(rq);
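
For readers who want to poke at the new balance_push_set() semantics outside the kernel, here is a rough user-space sketch of the logic in the first hunk. It is not kernel code: struct model_rq, model_balance_push_set() and warn_count are hypothetical stand-ins introduced only for illustration, locking is omitted, and the WARN_ON_ONCE() is modelled as a plain counter that increments on every hit rather than warning once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct callback_head. */
struct callback_head {
	int unused;
};

/* Hypothetical stand-in for struct rq; only the field the patch touches. */
struct model_rq {
	struct callback_head *balance_callback;
};

/* Plays the role of the kernel's balance_push_callback sentinel. */
struct callback_head balance_push_callback;

/* Counts how often the "callback already set" warning condition was hit. */
int warn_count;

/*
 * Models the patched balance_push_set() without the rq lock:
 *  - enabling installs the push callback and flags a pre-existing callback;
 *  - disabling clears the slot only if it still holds the push callback,
 *    so an unrelated pending callback is left alone.
 */
void model_balance_push_set(struct model_rq *rq, bool on)
{
	if (on) {
		if (rq->balance_callback)
			warn_count++;
		rq->balance_callback = &balance_push_callback;
	} else if (rq->balance_callback == &balance_push_callback) {
		rq->balance_callback = NULL;
	}
}
```

In this model, calling model_balance_push_set(&rq, false) twice in a row (once from the dying path and once from the activate path, as the patch does) is harmless: the second call finds the slot already empty and leaves it untouched.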