[tip: sched/core] sched/fair: Fix detection of per-CPU kthreads waking a task

From: tip-bot2 for Vincent Donnefort
Date: Mon Dec 06 2021 - 10:49:54 EST


The following commit has been merged into the sched/core branch of tip:

Commit-ID: 8b4e74ccb582797f6f0b0a50372ebd9fd2372a27
Gitweb: https://git.kernel.org/tip/8b4e74ccb582797f6f0b0a50372ebd9fd2372a27
Author: Vincent Donnefort <vincent.donnefort@xxxxxxx>
AuthorDate: Wed, 01 Dec 2021 14:34:50
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Sat, 04 Dec 2021 10:56:20 +01:00

sched/fair: Fix detection of per-CPU kthreads waking a task

select_idle_sibling() has a special case for tasks woken up by a per-CPU
kthread, where the selected CPU is the previous one. However, the current
condition for this exit path is incomplete. A task can wake up from an
interrupt context (e.g. hrtimer), while a per-CPU kthread is running. A
such scenario would spuriously trigger the special case described above.
Also, a recent change made the idle task like a regular per-CPU kthread,
hence making that situation more likely to happen
(is_per_cpu_kthread(swapper) being true now).

Checking for task context makes sure select_idle_sibling() will not
interpret a wake up from any other context as a wake up by a per-CPU
kthread.

Fixes: 52262ee567ad ("sched/fair: Allow a per-CPU kthread waking a task to stack on the same CPU, to fix XFS performance regression")
Signed-off-by: Vincent Donnefort <vincent.donnefort@xxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>
Link: https://lore.kernel.org/r/20211201143450.479472-1-vincent.donnefort@xxxxxxx
---
kernel/sched/fair.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 884f29d..5cd2798 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6398,6 +6398,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
* pattern is IO completions.
*/
if (is_per_cpu_kthread(current) &&
+ in_task() &&
prev == smp_processor_id() &&
this_rq()->nr_running <= 1) {
return prev;