Re: [PATCH 2/3] sched/fair: Prevent active LB from preempting higher sched classes

From: Qais Yousef
Date: Thu Aug 08 2019 - 05:25:02 EST


On 08/07/19 18:40, Valentin Schneider wrote:
> The CFS load balancer can cause the cpu_stopper to run a function to
> try and steal a rq's currently running task. However, it so happens
> that while only CFS tasks will ever be migrated by that function, we
> can end up preempting higher sched class tasks, since it is executed
> by the cpu_stopper.
>
> I don't expect this to be exceedingly common: we still need to have
> had a busiest group in the first place, which needs
>
> busiest->sum_nr_running != 0
>
> which is a cfs.h_nr_running sum, so we should have something to pull,
> but if we fail to pull anything and the remote rq is executing
> an RT/DL task we can hit this.
>
> Add an extra check to not trigger the cpu_stopper if the remote
> rq's running task isn't CFS.
>
> Signed-off-by: Valentin Schneider <valentin.schneider@xxxxxxx>
> ---
> kernel/sched/fair.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b56b8edee3d3..79bd6ead589c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8834,6 +8834,10 @@ static inline enum alb_status active_load_balance(struct lb_env *env)
>
> raw_spin_lock_irqsave(&busiest->lock, flags);
>
> + /* Make sure we're not about to stop a task from a higher sched class */
> + if (busiest->curr->sched_class != &fair_sched_class)
> + goto unlock;
> +

This looks correct to me, but I wonder if this check is something that belongs
to the CONFIG_PREEMPT_RT land. This will give a preference to not disrupt the
RT/DL tasks which is certainly the desired behavior there, but maybe in none
PREEMPT_RT world balancing CFS tasks is more important? Hmmm

--
Qais Yousef

> /*
> * Don't kick the active_load_balance_cpu_stop, if the curr task on
> * busiest CPU can't be moved to dst_cpu:
> --
> 2.22.0
>