Re: [PATCH 1/7] sched: Introduce scale-invariant load tracking

From: Morten Rasmussen
Date: Wed Oct 08 2014 - 10:16:48 EST


On Wed, Oct 08, 2014 at 03:08:04PM +0100, Vincent Guittot wrote:
> On 8 October 2014 15:53, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
> > On Wed, Oct 08, 2014 at 12:21:45PM +0100, Vincent Guittot wrote:
> >> On 8 October 2014 13:00, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
>
> >> >
> >> > Sure. The easiest way to avoid introducing overflows is to ensure that
> >> > we always scale by a factor <= 1.0. That should be true as long as
> >> > arch_scale_{cpu,freq}_capacity() never returns anything greater than
> >> > SCHED_CAPACITY_SCALE (= 1024 = 1.0).
> >>
> >> the current ARM arch_scale_cpu is in the range [0..1536] which is free
> >> of overflow AFAICT
> >
> > If I'm not mistaken, that will cause an overflow in
> > __update_task_entity_contrib():
> >
> > static inline void __update_task_entity_contrib(struct sched_entity *se)
> > {
> > 	u32 contrib;
> >
> > 	/* avoid overflowing a 32-bit type w/ SCHED_LOAD_SCALE */
> > 	contrib = se->avg.runnable_avg_sum * scale_load_down(se->load.weight);
> > 	contrib /= (se->avg.avg_period + 1);
> > 	se->avg.load_avg_contrib = scale_load(contrib);
> > }
> >
> > With arch_scale_cpu_capacity() > 1024 se->avg.runnable_avg_sum is no
> > longer bounded by LOAD_AVG_MAX = 47742. scale_load_down(se->load.weight)
> > == se->load.weight <= 88761.
> >
> > 47742 * 88761 = 4237627662 (2^32 = 4294967296)
> >
> > To avoid overflow se->avg.runnable_avg_sum must be less than 2^32/88761
> > = 48388, which means that arch_scale_cpu_capacity() must be in the range
> > 0..48388*1024/47742 = 0..1037.
> >
> > I also think it is easier to have a fixed defined max scaling factor,
> > but that might just be me.
>
> OK, overflow comes with adding uarch invariance into runnable load average
>
> >
> > Regarding the ARM arch_scale_cpu_capacity() implementation, I think that
> > can be changed to fit the 0..1024 range easily. Currently, it will only
> > report a non-default (1024) capacity for big.LITTLE systems and actually
> > enabling it (requires a certain property to be set in device tree) leads
> > to broken load-balancing decisions. We have discussed that several times
>
> Only the one-task-per-CPU case is broken. On the other hand, it handles
> the overload use case (more tasks than CPUs) and other mid-range use
> cases better, by putting more tasks on the big cluster.

Yes, agreed. My point was just to say that it shouldn't cause a lot of
harm changing the range of arch_scale_cpu_capacity() for ARM. We need to
fix things anyway.
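
For anyone who wants to double-check the overflow bound quoted above, here
is a minimal userspace sketch (not kernel code). It assumes the worst case
from the thread: runnable_avg_sum at LOAD_AVG_MAX scaled linearly by the
capacity factor, and se->load.weight at its quoted upper bound of 88761.
The helper names and the MAX_LOAD_WEIGHT macro are made up for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define LOAD_AVG_MAX         47742u /* bound on runnable_avg_sum at scale 1.0 */
#define MAX_LOAD_WEIGHT      88761u /* hypothetical name: weight bound from the thread */
#define SCHED_CAPACITY_SCALE 1024u  /* fixed-point 1.0 */

/* Largest runnable_avg_sum whose product with the max weight fits in a u32. */
static uint32_t max_safe_sum(void)
{
	return UINT32_MAX / MAX_LOAD_WEIGHT;
}

/* Largest capacity factor that keeps the scaled sum within max_safe_sum(). */
static uint32_t max_safe_capacity(void)
{
	return (uint32_t)((uint64_t)max_safe_sum() * SCHED_CAPACITY_SCALE /
			  LOAD_AVG_MAX);
}

/*
 * Returns 1 if the u32 multiplication in __update_task_entity_contrib()
 * would wrap for the given capacity factor, 0 otherwise. The check is done
 * in 64 bits so that it cannot itself overflow.
 */
static int contrib_overflows(uint32_t capacity)
{
	uint64_t sum = (uint64_t)LOAD_AVG_MAX * capacity / SCHED_CAPACITY_SCALE;

	return sum * MAX_LOAD_WEIGHT > UINT32_MAX;
}

/* Sanity-check the figures quoted in the thread. */
static void check_thread_figures(void)
{
	assert(max_safe_sum() == 48388);
	assert(max_safe_capacity() == 1037);
}
```

Under these assumptions, contrib_overflows() is false for a capacity of
1024 and true for 1536, which is the case being discussed for the current
ARM range.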