Re: [PATCH 1/1] [PATCH v3]sched/pelt: Fix the attach_entity_load_avg calculate method

From: Kuyo Chang
Date: Thu Apr 14 2022 - 23:56:42 EST


On Thu, 2022-04-14 at 16:44 +0200, Peter Zijlstra wrote:
> I've taken the liberty of carrying over the tags from v2 and reworked
> the Changelog a little.

Thank you for all your assistance.

> ---
> Subject: sched/pelt: Fix attach_entity_load_avg() corner case
> From: kuyo chang <kuyo.chang@xxxxxxxxxxxx>
> Date: Thu, 14 Apr 2022 17:02:20 +0800
>
> From: kuyo chang <kuyo.chang@xxxxxxxxxxxx>
>
> The warning in cfs_rq_is_decayed() triggered:
>
>   SCHED_WARN_ON(cfs_rq->avg.load_avg ||
>                 cfs_rq->avg.util_avg ||
>                 cfs_rq->avg.runnable_avg)
>
> There exists a corner case in attach_entity_load_avg() which will
> cause load_sum to be zero while load_avg will not be.
>
> Consider a se_weight of 88761, as per the sched_prio_to_weight[] table.
> Further assume get_pelt_divider() returns 47742; this gives:
> se->avg.load_avg is 1.
>
> However, calculating load_sum results in 0:
>
>   se->avg.load_sum = div_u64(se->avg.load_avg * se->avg.load_sum,
>                              se_weight(se));
>   se->avg.load_sum = 1*47742/88761 = 0.
>
> Then enqueue_load_avg() adds this to the cfs_rq totals:
>
> cfs_rq->avg.load_avg += se->avg.load_avg;
> cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;
>
> This results in load_avg being 1 while load_sum is 0, which triggers
> the WARN.
>
> Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq and its se")
> Signed-off-by: kuyo chang <kuyo.chang@xxxxxxxxxxxx>
> [peterz: massage changelog]
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Reviewed-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> Tested-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> Link: https://lkml.kernel.org/r/20220414090229.342-1-kuyo.chang@mediatek.com
>
> ---
> kernel/sched/fair.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3829,11 +3829,11 @@ static void attach_entity_load_avg(struc
>
>  	se->avg.runnable_sum = se->avg.runnable_avg * divider;
>  
> -	se->avg.load_sum = divider;
> -	if (se_weight(se)) {
> -		se->avg.load_sum =
> -			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
> -	}
> +	se->avg.load_sum = se->avg.load_avg * divider;
> +	if (se_weight(se) < se->avg.load_sum)
> +		se->avg.load_sum = div_u64(se->avg.load_sum, se_weight(se));
> +	else
> +		se->avg.load_sum = 1;
>  
>  	enqueue_load_avg(cfs_rq, se);
>  	cfs_rq->avg.util_avg += se->avg.util_avg;
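
For anyone following the arithmetic in the changelog, here is a minimal
user-space sketch of the two calculations. It is not kernel code: the
helper names old_load_sum() and new_load_sum() are hypothetical, plain
64-bit division stands in for div_u64(), and the weight is assumed to be
non-zero.

```c
#include <stdint.h>

/*
 * Hypothetical user-space stand-ins for the load_sum calculation in
 * attach_entity_load_avg(). Plain '/' replaces div_u64(); 'weight'
 * plays the role of se_weight(se) and is assumed to be non-zero.
 */

/* Calculation before the patch: the quotient can truncate to 0. */
static uint64_t old_load_sum(uint64_t load_avg, uint64_t divider,
			     uint64_t weight)
{
	uint64_t load_sum = divider;

	if (weight)
		load_sum = (load_avg * load_sum) / weight;

	return load_sum;
}

/* Calculation after the patch: the result is clamped to at least 1. */
static uint64_t new_load_sum(uint64_t load_avg, uint64_t divider,
			     uint64_t weight)
{
	uint64_t load_sum = load_avg * divider;

	if (weight < load_sum)
		load_sum = load_sum / weight;
	else
		load_sum = 1;

	return load_sum;
}
```

With the values from the changelog (load_avg 1, divider 47742, weight
88761), old_load_sum() returns 0 and new_load_sum() returns 1, so
load_sum stays non-zero whenever load_avg is attached.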