Re: sched: tweak select_idle_sibling to look for idle threads

From: Yuyang Du
Date: Tue May 10 2016 - 22:58:23 EST


On Tue, May 10, 2016 at 05:26:05PM +0200, Mike Galbraith wrote:
> On Tue, 2016-05-10 at 09:49 +0200, Mike Galbraith wrote:
>
> > Only whacking
> > cfs_rq_runnable_load_avg() with a rock makes schbench -m <sockets> -t
> > <near socket size> -a work well. 'Course a rock in its gearbox also
> > rendered load balancing fairly busted for the general case :)
>
> Smaller rock doesn't injure heavy tbench, but more importantly, still
> demonstrates the issue when you want full spread.
>
> schbench -m4 -t38 -a
>
> cputime 30000 threads 38 p99 177
> cputime 30000 threads 39 p99 10160
>
> LB_TIP_AVG_HIGH
> cputime 30000 threads 38 p99 193
> cputime 30000 threads 39 p99 184
> cputime 30000 threads 40 p99 203
> cputime 30000 threads 41 p99 202
> cputime 30000 threads 42 p99 205
> cputime 30000 threads 43 p99 218
> cputime 30000 threads 44 p99 237
> cputime 30000 threads 45 p99 245
> cputime 30000 threads 46 p99 262
> cputime 30000 threads 47 p99 296
> cputime 30000 threads 48 p99 3308
>
> 47*4+4=nr_cpus yay

yay... and haha, "a perfect world"...

> ---
> kernel/sched/fair.c | 3 +++
> kernel/sched/features.h | 1 +
> 2 files changed, 4 insertions(+)
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3027,6 +3027,9 @@ void remove_entity_load_avg(struct sched
>
> static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq)
> {
> +     if (sched_feat(LB_TIP_AVG_HIGH) && cfs_rq->load.weight > cfs_rq->runnable_load_avg*2)
> +             return cfs_rq->runnable_load_avg + min_t(unsigned long, NICE_0_LOAD,
> +                                                      cfs_rq->load.weight/2);
>       return cfs_rq->runnable_load_avg;
> }

cfs_rq->runnable_load_avg is certainly no greater than load.weight (in this
case much less, maybe 1/2 of it). And load_avg is not necessarily a rock in
the gearbox that only impedes speeding up; it impedes slowing down as well.
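
To make the arithmetic of the quoted hunk concrete, here is a rough user-space
sketch of what it computes. This is not kernel code: the struct, the helper
names, and the value 1024 used for NICE_0_LOAD are assumptions made only for
illustration.

```c
#include <stdio.h>

/* Assumption: 1024 stands in for NICE_0_LOAD; the real value depends on the kernel config. */
#define NICE_0_LOAD_SKETCH 1024UL

/* Hypothetical stand-in for the two struct cfs_rq fields the hunk reads. */
struct cfs_rq_sketch {
	unsigned long load_weight;       /* models cfs_rq->load.weight */
	unsigned long runnable_load_avg; /* models cfs_rq->runnable_load_avg */
};

/* Plain replacement for min_t(unsigned long, a, b). */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Models the patched cfs_rq_runnable_load_avg(): when the feature is on and
 * the instantaneous weight is more than twice the average, tip the average
 * upward by min(NICE_0_LOAD, weight/2). Otherwise return the average as is.
 */
static unsigned long tipped_runnable_load_avg(const struct cfs_rq_sketch *rq,
					      int feat_on)
{
	if (feat_on && rq->load_weight > rq->runnable_load_avg * 2)
		return rq->runnable_load_avg +
		       min_ul(NICE_0_LOAD_SKETCH, rq->load_weight / 2);
	return rq->runnable_load_avg;
}
```

With a weight of 1024 and an average of 300, the condition holds (1024 > 600)
and the result is 300 + min(1024, 512) = 812. With an average of 900 the
condition fails (1024 > 1800 is false) and the plain 900 comes back.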

But I really don't know what kind of load references select_task_rq() should
be using. So maybe the real issue is a mix of them, i.e., balancing conflated
with just wanting an idle CPU?