Re: [PATCH] sched/fair: improve spreading of utilization

From: Vincent Guittot
Date: Fri Mar 13 2020 - 12:10:10 EST


On Fri, 13 Mar 2020 at 16:47, Valentin Schneider
<valentin.schneider@xxxxxxx> wrote:
>
>
> On Fri, Mar 13 2020, Vincent Guittot wrote:
> >> > And with more coffee that's another Doh, ASYM_PACKING would end up as
> >> > migrate_task. So this only affects the reduced capacity migration, which
> >>
> >> yes ASYM_PACKING uses migrate_task and the case of reduced capacity
> >> would use it too and would not be impacted by this patch. I say
> >> "would" because the original rework of load balance got rid of this
> >> case. I'm going to prepare a separate fix for this
> >
> > After more thought, I think that we are safe for reduced capacity too
> > because this is handled in the migrate_load case. In my previous
> > reply, I was thinking of the case where rq is not overloaded but cpu
> > has reduced capacity which is not handled. But in such case, we don't
> > have to force the migration of the task because there is still enough
> > capacity otherwise rq would be overloaded and we are back to the case
> > already handled
> >
>
> Good point on the capacity reduction vs group_is_overloaded.
>
> That said, can't we also reach this with migrate_task? Say the local

The test has only been added for migrate_util, so migrate_task is not impacted.

> group is entirely idle, and the busiest group has a few non-idle CPUs
> but they all have at most 1 running task. AFAICT we would still go to
> calculate_imbalance(), and try to balance out the number of idle CPUs.

Such a case is handled by migrate_task, when we try to even out the number
of tasks between groups.
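As a rough illustration of the scenario above, here is a toy sketch in C of evening out idle CPUs between a local and a busiest group. It assumes the imbalance is simply half the difference in idle CPU counts, clamped at zero. The function name toy_idle_imbalance() and its parameters are hypothetical, and the real calculate_imbalance() handles more cases than this.

```c
#include <assert.h>

/*
 * Toy model only, not kernel code: number of tasks to pull so that the
 * local and busiest groups end up with a similar number of idle CPUs.
 * Assumption: imbalance is half the idle-CPU difference, never negative.
 */
static long toy_idle_imbalance(long local_idle_cpus, long busiest_idle_cpus)
{
	long diff = local_idle_cpus - busiest_idle_cpus;

	assert(local_idle_cpus >= 0 && busiest_idle_cpus >= 0);

	/* Nothing to pull if the local group is not the more idle one. */
	return diff > 0 ? diff / 2 : 0;
}
```

With a fully idle 4-CPU local group and a busiest group that has 2 of its 4 CPUs idle, this toy model asks for one task to be pulled, which is the kind of request migrate_task then tries to satisfy.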

>
> If the migration_type is migrate_util, that can't happen because of this
> change. Since we have this progressive balancing strategy (tasks -> util
> -> load), it's a bit odd to have this "gap" in the middle where we get
> one less possibility to trigger active balance, don't you think? That
> is, providing I didn't say nonsense again :)

Right now, I can't think of a use case that could trigger such a
situation, because we use migrate_util when the source is overloaded,
which means that there is at least one waiting task, and we give
priority to that task.
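To make the effect of the test concrete, here is a minimal sketch in C of a busiest-queue selection for the migrate_util case. It is a toy model, not the kernel's code: struct toy_rq, toy_busiest_for_util() and the skip_single_task flag are hypothetical names, and the only behaviour modelled is "pick the highest utilization, optionally ignoring runqueues with at most one running task".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a runqueue; NOT the kernel's struct rq. */
struct toy_rq {
	unsigned int nr_running;	/* runnable tasks on this CPU */
	unsigned long util;		/* utilization of this CPU */
};

/*
 * Return the index of the runqueue with the highest utilization,
 * or -1 if no runqueue qualifies.
 *
 * When skip_single_task is true, runqueues with nr_running <= 1 are
 * ignored: with a single running task there is no waiting task that
 * could be detached, whatever the utilization.
 */
static int toy_busiest_for_util(const struct toy_rq *rqs, size_t n,
				bool skip_single_task)
{
	unsigned long busiest_util = 0;
	int busiest = -1;

	assert(rqs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (skip_single_task && rqs[i].nr_running <= 1)
			continue;

		if (busiest < 0 || rqs[i].util > busiest_util) {
			busiest_util = rqs[i].util;
			busiest = (int)i;
		}
	}

	return busiest;
}
```

In this model, a CPU with one task and high utilization is picked when the flag is false, and a CPU that actually has a waiting task is picked when the flag is true. If every candidate has at most one running task, nothing is selected, which is the "gap" being discussed.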

>
> It's not a super big deal, but I think it's nice if we can maintain a
> consistent / gradual migration policy.
>
> >>
> >> > might be hard to notice in benchmarks.