Re: [PATCH 1/7] sched: Introduce scale-invariant load tracking

From: Vincent Guittot
Date: Wed Oct 08 2014 - 10:08:32 EST


On 8 October 2014 15:53, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
> On Wed, Oct 08, 2014 at 12:21:45PM +0100, Vincent Guittot wrote:
>> On 8 October 2014 13:00, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:

>> >
>> > Sure. The easiest way to avoid introducing overflows is to ensure that
>> > we always scale by a factor <= 1.0. That should be true as long as
>> > arch_scale_{cpu,freq}_capacity() never returns anything greater than
>> > SCHED_CAPACITY_SCALE (= 1024 = 1.0).
>>
>> The current ARM arch_scale_cpu is in the range [0..1536], which is free
>> of overflow AFAICT.
>
> If I'm not mistaken, that will cause an overflow in
> __update_task_entity_contrib():
>
> static inline void __update_task_entity_contrib(struct sched_entity *se)
> {
> 	u32 contrib;
>
> 	/* avoid overflowing a 32-bit type w/ SCHED_LOAD_SCALE */
> 	contrib = se->avg.runnable_avg_sum * scale_load_down(se->load.weight);
> 	contrib /= (se->avg.avg_period + 1);
> 	se->avg.load_avg_contrib = scale_load(contrib);
> }
>
> With arch_scale_cpu_capacity() > 1024, se->avg.runnable_avg_sum is no
> longer bounded by LOAD_AVG_MAX = 47742. scale_load_down(se->load.weight)
> == se->load.weight <= 88761.
>
> 47742 * 88761 = 4237627662 (2^32 = 4294967296)
>
> To avoid overflow se->avg.runnable_avg_sum must be less than 2^32/88761
> = 48388, which means that arch_scale_cpu_capacity() must be in the range
> 0..48388*1024/47742 = 0..1037.
>
> I also think it is easier to have a fixed defined max scaling factor,
> but that might just be me.

OK, so the overflow comes from adding uarch invariance to the runnable
load average.
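
For reference, the numbers quoted above can be checked with a small
user-space sketch. This is not kernel code: the helper names
(max_safe_sum(), max_safe_capacity(), contrib_overflows()) are made up
for illustration, and it assumes that runnable_avg_sum scales linearly
with the capacity factor, so its upper bound becomes
LOAD_AVG_MAX * capacity / SCHED_CAPACITY_SCALE.

```c
#include <assert.h>
#include <stdint.h>

/* Constants taken from the discussion above. */
#define LOAD_AVG_MAX		47742u	/* bound on runnable_avg_sum without scaling */
#define SCHED_CAPACITY_SCALE	1024u	/* fixed-point 1.0 */
#define MAX_TASK_WEIGHT		88761u	/* largest se->load.weight (nice -20) */

/* Hypothetical helper: largest sum for which sum * weight fits in a u32. */
static uint32_t max_safe_sum(uint32_t weight)
{
	assert(weight != 0);
	return UINT32_MAX / weight;
}

/*
 * Hypothetical helper: largest capacity factor that keeps the scaled
 * bound on runnable_avg_sum within max_safe_sum().
 */
static uint32_t max_safe_capacity(uint32_t weight)
{
	return (uint64_t)max_safe_sum(weight) * SCHED_CAPACITY_SCALE / LOAD_AVG_MAX;
}

/*
 * Hypothetical helper: returns 1 if the worst-case product computed in
 * __update_task_entity_contrib() would no longer fit in a u32 when the
 * sum is scaled by 'capacity' (assumed linear scaling).
 */
static int contrib_overflows(uint32_t capacity, uint32_t weight)
{
	uint64_t sum = (uint64_t)LOAD_AVG_MAX * capacity / SCHED_CAPACITY_SCALE;

	return sum * weight > UINT32_MAX;
}
```

With these assumptions, contrib_overflows(1024, MAX_TASK_WEIGHT) is 0
and contrib_overflows(1536, MAX_TASK_WEIGHT) is 1, which matches the
0..1037 range derived above.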

>
> Regarding the ARM arch_scale_cpu_capacity() implementation, I think that
> can be changed to fit the 0..1024 range easily. Currently, it will only
> report a non-default (1024) capacity for big.LITTLE systems and actually
> enabling it (requires a certain property to be set in device tree) leads
> to broken load-balancing decisions. We have discussed that several times

Only the one-task-per-CPU case is broken. On the other hand, it better
handles the overload use case, where we have more tasks than CPUs, and
other mid-range use cases, by putting more tasks on the big cluster.

> in the past. I wouldn't recommend enabling it until the load-balance
> code can deal with big.LITTLE compute capacities correctly. This is also
> why it isn't implemented by ARM64.