Re: [PATCH v2] sched/fair: fix case with reduced capacity CPU

From: Vincent Guittot
Date: Mon Jul 11 2022 - 12:42:42 EST


On Mon, 11 Jul 2022 at 18:03, Qais Yousef <qais.yousef@xxxxxxx> wrote:
>
> Hi Vincent
>
> On 07/08/22 17:44, Vincent Guittot wrote:
> > The capacity of a CPU available for CFS tasks can be reduced by other
> > activities running on that CPU. In such a case, it's worth trying to move
> > CFS tasks to a CPU with more available capacity.
> >
> > The rework of the load balance has filtered the case when the CPU is
> > classified to be fully busy but its capacity is reduced.
> >
> > Check if the CPU's capacity is reduced while gathering load balance
> > statistics and classify it as group_misfit_task instead of group_fully_busy
> > so we can try to move the load to another CPU.
> >
> > Reported-by: David Chen <david.chen@xxxxxxxxxxx>
> > Reported-by: Zhang Qiao <zhangqiao22@xxxxxxxxxx>
> > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > Tested-by: David Chen <david.chen@xxxxxxxxxxx>
> > Tested-by: Zhang Qiao <zhangqiao22@xxxxxxxxxx>
> > ---
>
> [...]
>
> > @@ -8820,8 +8833,9 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> >
> >  	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
> >  		struct rq *rq = cpu_rq(i);
> > +		unsigned long load = cpu_load(rq);
> >
> > -		sgs->group_load += cpu_load(rq);
> > +		sgs->group_load += load;
> >  		sgs->group_util += cpu_util_cfs(i);
> >  		sgs->group_runnable += cpu_runnable(rq);
> >  		sgs->sum_h_nr_running += rq->cfs.h_nr_running;
> > @@ -8851,11 +8865,17 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> >  		if (local_group)
> >  			continue;
> >
> > -		/* Check for a misfit task on the cpu */
> > -		if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
> > -		    sgs->group_misfit_task_load < rq->misfit_task_load) {
> > -			sgs->group_misfit_task_load = rq->misfit_task_load;
> > -			*sg_status |= SG_OVERLOAD;
> > +		if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
> > +			/* Check for a misfit task on the cpu */
> > +			if (sgs->group_misfit_task_load < rq->misfit_task_load) {
> > +				sgs->group_misfit_task_load = rq->misfit_task_load;
> > +				*sg_status |= SG_OVERLOAD;
> > +			}
> > +		} else if ((env->idle != CPU_NOT_IDLE) &&
> > +			   sched_reduced_capacity(rq, env->sd)) {
> > +			/* Check for a task running on a CPU with reduced capacity */
> > +			if (sgs->group_misfit_task_load < load)
> > +				sgs->group_misfit_task_load = load;
> >  		}
> >  	}
>
> A few small questions, mostly for my own education.
>
> The new condition only applies to SMP systems. Is the reason asym systems
> don't care that the misfit check already considers capacity pressure when
> checking that the task fits_capacity()?

Yes

>
> It **seems** to me that the migration margin in fits_capacity() acts like the
> sd->imbalance_pct when check_cpu_capacity() is called by
> sched_reduced_capacity(), did I get it right?

Yes
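
To make the parallel concrete, here is a simplified userspace model of the
two checks. It is only a sketch, not the kernel source: the model_* names
are hypothetical, the 1280/1024 margin is assumed to match fits_capacity(),
and the rq and sd fields are passed as plain integers.

```c
#include <stdbool.h>

/* Assumed migration margin: util must stay below ~80% of the capacity. */
#define MODEL_MARGIN 1280UL

/* Model of fits_capacity(): true while util * margin < capacity * 1024. */
static bool model_fits_capacity(unsigned long util, unsigned long capacity)
{
	return util * MODEL_MARGIN < capacity * 1024UL;
}

/*
 * Model of check_cpu_capacity(): true when the capacity left for CFS,
 * scaled by imbalance_pct, is still below the original CPU capacity.
 */
static bool model_check_cpu_capacity(unsigned long cpu_capacity,
				     unsigned long cpu_capacity_orig,
				     unsigned int imbalance_pct)
{
	return cpu_capacity * imbalance_pct < cpu_capacity_orig * 100UL;
}
```

Both are the same kind of scaled comparison. With an imbalance_pct of 117
(used here purely as an illustrative value), a CPU counts as reduced once
its CFS capacity drops below roughly 85% of the original; with the assumed
1280 margin, a task stops fitting once it needs roughly 80% of the capacity.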

>
> If I got it right, if the migration margin is ever tweaked, could we potentially
> start seeing this kind of reported issue on asym systems then? I guess not. It
> just seems to me for asym systems tweaking the migration margin is similar to
> tweaking imbalance_pct for smp ones. But the subtlety is greater as
> imbalance_pct is still used in asym systems.

You should not, because the task will end up being misfit whatever the
margin. The only change would be how quickly you detect it and migrate.
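
As a rough illustration of that last point, the sketch below computes the
smallest utilization at which a task stops fitting for a given margin. It
assumes the same util * margin < capacity * 1024 comparison as
fits_capacity(); the function name is hypothetical and the model ignores
how utilization actually ramps up over time.

```c
/*
 * Smallest util for which util * margin < capacity * 1024 is false,
 * i.e. the point where the task would start being reported as misfit.
 * Computed as ceil(capacity * 1024 / margin) with integer arithmetic.
 */
static unsigned long model_misfit_threshold(unsigned long capacity,
					     unsigned long margin)
{
	return (capacity * 1024UL + margin - 1) / margin;
}
```

For a CPU of capacity 1024, a margin of 1280 gives a threshold of 820 and a
margin of 1100 gives 954. A task whose utilization keeps growing crosses
either threshold eventually; a smaller margin only means it crosses later.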


>
>
> Thanks
>
> --
> Qais Yousef