Re: [PATCH] sched: fix incorrect PELT values on SMT

From: Dietmar Eggemann
Date: Fri Aug 19 2016 - 11:01:05 EST


Hi Steve,

On 19/08/16 02:55, Steve Muckle wrote:
> PELT scales its util_sum and util_avg values via
> arch_scale_cpu_capacity(). If that function is passed the CPU's sched
> domain then it will reduce the scaling capacity if SD_SHARE_CPUCAPACITY
> is set. PELT does not pass in the sd however. The other caller of
> arch_scale_cpu_capacity, update_cpu_capacity(), does. This means
> util_sum and util_avg scale beyond the CPU capacity on SMT.
>
> On an Intel i7-3630QM for example rq->cpu_capacity_orig is 589 but
> util_avg scales up to 1024.
>
> Fix this by passing in the sd in __update_load_avg() as well.
>
> Signed-off-by: Steve Muckle <smuckle@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 61d485421bed..95d34b337152 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2731,7 +2731,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> sa->last_update_time = now;
>
> scale_freq = arch_scale_freq_capacity(NULL, cpu);
> - scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
> + scale_cpu = arch_scale_cpu_capacity(cpu_rq(cpu)->sd, cpu);

Wouldn't you have to subscribe to the RCU-protected pointer rq->sd with
something like 'rcu_dereference(cpu_rq(cpu)->sd)'?

IMHO, __update_load_avg() is also called outside existing RCU read-side
critical sections, so an rcu_read_lock()/rcu_read_unlock() pair would be
required in this case.
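
To make the suggestion concrete, here is a rough userspace sketch of the
access pattern I have in mind. It is not a proposed patch: the RCU
primitives are no-op stand-ins so that it compiles outside the kernel, the
struct layouts are trimmed down, the SD_SHARE_CPUCAPACITY value is picked
for illustration, scale_cpu_of() is a hypothetical helper name, and
arch_scale_cpu_capacity() is only modelled on my reading of the generic
default.

```c
#include <stddef.h>

/* Userspace stand-ins so the sketch compiles outside the kernel.
 * In the kernel the real primitives come from <linux/rcupdate.h>. */
#define rcu_read_lock()    do { } while (0)
#define rcu_read_unlock()  do { } while (0)
#define rcu_dereference(p) (p)

#define SCHED_CAPACITY_SCALE  1024UL
#define SD_SHARE_CPUCAPACITY  0x0080	/* flag value assumed for illustration */

/* Trimmed-down stand-ins for the kernel structures */
struct sched_domain {
	int flags;
	unsigned int span_weight;	/* number of CPUs in the domain */
	unsigned long smt_gain;
};

struct rq {
	struct sched_domain *sd;	/* RCU-protected in the kernel */
};

/* Modelled on the generic default: SMT siblings share smt_gain,
 * everything else (including sd == NULL) gets the full scale. */
static unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
{
	(void)cpu;
	if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && sd->span_weight > 1)
		return sd->smt_gain / sd->span_weight;
	return SCHED_CAPACITY_SCALE;
}

/* Hypothetical helper: dereference rq->sd only inside a read-side
 * critical section and keep just the resulting scalar afterwards. */
static unsigned long scale_cpu_of(struct rq *rq, int cpu)
{
	unsigned long scale_cpu;

	rcu_read_lock();
	scale_cpu = arch_scale_cpu_capacity(rcu_dereference(rq->sd), cpu);
	rcu_read_unlock();

	return scale_cpu;
}
```

With an assumed smt_gain of 1178 and two siblings in the domain, this
yields the 589 quoted above, and 1024 when no sd is attached. Only the
scalar result leaves the critical section, so the sd pointer is never used
after rcu_read_unlock().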

>
> /* delta_w is the amount already accumulated against our next period */
> delta_w = sa->period_contrib;
>