Re: [RFC PATCH v2 1/2] sched/fair: Introduce UTIL_FITS_CAPACITY feature (v2)

From: Chen Yu
Date: Wed Oct 25 2023 - 03:46:08 EST


On 2023-10-24 at 17:03:25 +0200, Dietmar Eggemann wrote:
> On 24/10/2023 08:10, Chen Yu wrote:
> > On 2023-10-23 at 11:04:49 -0400, Mathieu Desnoyers wrote:
> >> On 2023-10-23 10:11, Dietmar Eggemann wrote:
> >>> On 19/10/2023 18:05, Mathieu Desnoyers wrote:
>
> [...]
>
> >>> Or like find_energy_efficient_cpu() (feec(), used in
> >>> Energy-Aware-Scheduling (EAS)) which uses cpu_util(cpu, p, cpu, 0) to get:
> >>>
> >>> max(util_avg(CPU + p), util_est(CPU + p))
> >>
> >> I've tried using cpu_util(), but unfortunately anything that considers
> >> blocked/sleeping tasks in its utilization total does not work for my
> >> use-case.
> >>
> >> From cpu_util():
> >>
> >> * CPU utilization is the sum of running time of runnable tasks plus the
> >> * recent utilization of currently non-runnable tasks on that CPU.
> >>
> >
> > I thought cpu_util() indicates the decayed utilization sum of tasks that were
> > once "running" on this CPU, but does not sum up the "util/load" of
> > blocked/sleeping tasks?
>
> cpu_util() here refers to:
>
> cpu_util(int cpu, struct task_struct *p, int dst_cpu, int boost)
>
> which when called with (cpu, p, cpu, 0) and task_cpu(p) != cpu returns:
>
> max(util_avg(CPU + p), util_est(CPU + p))
>
> The term `CPU utilization` in cpu_util()'s header stands for
> cfs_rq->avg.util_avg.
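
To make the (cpu, p, cpu, 0) case concrete, here is a minimal user-space
sketch of that max(). The names toy_cpu, toy_task and toy_cpu_util_with are
hypothetical stand-ins, not kernel code. It assumes boost == 0, UTIL_EST
enabled and task_cpu(p) != cpu, and leaves out everything else the real
function does.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-CPU cfs_rq signals. */
struct toy_cpu {
	unsigned long util_avg;	/* models cfs_rq->avg.util_avg */
	unsigned long util_est;	/* models the cfs_rq's estimated utilization */
};

/* Hypothetical stand-in for the task's own signals. */
struct toy_task {
	unsigned long util_avg;
	unsigned long util_est;
};

/*
 * Utilization of @cpu as if @p, currently on another CPU, were placed
 * there: max(util_avg(CPU + p), util_est(CPU + p)).
 */
static unsigned long toy_cpu_util_with(const struct toy_cpu *cpu,
				       const struct toy_task *p)
{
	unsigned long util, est;

	assert(cpu && p);

	util = cpu->util_avg + p->util_avg;	/* util_avg(CPU + p) */
	est = cpu->util_est + p->util_est;	/* util_est(CPU + p) */

	return util > est ? util : est;
}
```

Note that the util_avg term starts from whatever cfs_rq->avg.util_avg
currently holds, so any blocked contribution still in it is carried into the
result.
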
>
> It does not sum up the utilization of blocked tasks but it can contain
> it. They have to be blocked tasks and not tasks which were running in
> cfs_rq since we subtract utilization of tasks which are migrating away
> from the cfs_rq (cfs_rq->removed.util_avg in remove_entity_load_avg()
> and update_cfs_rq_load_avg()).

Thanks for the detailed description, Dietmar. Yes, I just realized that
if the blocked tasks once ran on this cfs_rq and have not been migrated away,
the cfs_rq's util_avg will still contain their utilization.
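
A toy model of that bookkeeping, as I understand it. The toy_* names are
hypothetical, the subtraction happens immediately instead of being deferred
through cfs_rq->removed.util_avg, and PELT decay is ignored to keep the
arithmetic obvious:

```c
#include <assert.h>

struct toy_task {
	unsigned long util_avg;
	int on_rq;
};

struct toy_cfs_rq {
	unsigned long util_avg;
};

/* Task arrives on this cfs_rq: its utilization is added. */
static void toy_attach(struct toy_cfs_rq *rq, struct toy_task *p)
{
	assert(rq && p);
	rq->util_avg += p->util_avg;
	p->on_rq = 1;
}

/* Task blocks: nothing is subtracted, the cfs_rq still contains it. */
static void toy_block(struct toy_task *p)
{
	assert(p);
	p->on_rq = 0;
}

/* Task migrates away: its utilization is removed, clamped at zero. */
static void toy_migrate_away(struct toy_cfs_rq *rq, struct toy_task *p)
{
	assert(rq && p);
	if (rq->util_avg > p->util_avg)
		rq->util_avg -= p->util_avg;
	else
		rq->util_avg = 0;
	p->on_rq = 0;
}
```

With tasks of utilization 300 and 200 attached, the cfs_rq reads 500. It
still reads 500 after the second task blocks, and drops to 300 only once that
task is migrated away.
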

thanks,
Chenyu

> > accumulate_sum()
> >     /* only the running task's util will be summed up */
> >     if (running)
> >         sa->util_sum += contrib << SCHED_CAPACITY_SHIFT;
> >
> > WRITE_ONCE(sa->util_avg, sa->util_sum / divider);
>
> __update_load_avg_cfs_rq()
>
>   ___update_load_sum(..., cfs_rq->curr != NULL
>                           ^^^^^^^^^^^^^^^^^^^^
>                                 running
>     accumulate_sum()
>
>       if (periods)
>         /* decay _sum */
>         sa->util_sum = decay_load(sa->util_sum, ...)
>
>       if (load)
>         /* decay and accrue _sum */
>         contrib = __accumulate_pelt_segments(...)
>
> When crossing periods we decay the old _sum and when additionally load
> != 0 we decay and accrue the new _sum as well.
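
The decay-then-accrue behaviour can be sketched with a floating-point toy
model. This is not the kernel's fixed-point code: the toy_* names are
hypothetical, time advances in whole 1024us periods only, and the decay
factor assumes the usual PELT half-life of 32 periods (y^32 = 0.5).

```c
#include <assert.h>

#define TOY_PELT_Y		0.978572062	/* y, chosen so that y^32 ~= 0.5 */
#define TOY_PERIOD_US		1024.0
#define TOY_CAPACITY_SCALE	1024.0

struct toy_avg {
	double util_sum;
};

/*
 * Cross one full period: the old _sum always decays, and a new
 * contribution is accrued only while something is running.
 */
static void toy_accumulate_period(struct toy_avg *sa, int running)
{
	assert(sa);
	sa->util_sum *= TOY_PELT_Y;		/* decay _sum */
	if (running)
		sa->util_sum += TOY_PERIOD_US;	/* accrue _sum */
}

/* Scale _sum by the largest value it can converge to. */
static double toy_util_avg(const struct toy_avg *sa)
{
	const double max_sum = TOY_PERIOD_US / (1.0 - TOY_PELT_Y);

	assert(sa);
	return TOY_CAPACITY_SCALE * sa->util_sum / max_sum;
}
```

Calling toy_accumulate_period(&sa, 1) for many periods drives toy_util_avg()
towards 1024. A further 32 calls with running == 0 roughly halve it, which is
the decaying "blocked" contribution discussed above.
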