Re: [patch] sched/fair: Use instantaneous load in wakeup paths

From: Mike Galbraith
Date: Fri Jun 17 2016 - 09:57:21 EST


On Fri, 2016-06-17 at 11:55 +0100, Dietmar Eggemann wrote:
> On 17/06/16 07:21, Mike Galbraith wrote:
> > Here are some schbench runs on an 8x8 box to show that longish
> > run/sleep period corner I mentioned.
> >
> > vogelweide:~/:[1]# for i in `seq 5`; do schbench -m 8 -t 1 -a -r 10 2>&1 | grep 'threads 8'; done
> > cputime 30000 threads 8 p99 68
> > cputime 30000 threads 8 p99 46
> > cputime 30000 threads 8 p99 46
> > cputime 30000 threads 8 p99 45
> > cputime 30000 threads 8 p99 49
> > vogelweide:~/:[0]# echo NO_WAKE_INSTANTANEOUS_LOAD > /sys/kernel/debug/sched_features
> > vogelweide:~/:[0]# for i in `seq 5`; do schbench -m 8 -t 1 -a -r 10 2>&1 | grep 'threads 8'; done
> > cputime 30000 threads 8 p99 9968
> > cputime 30000 threads 8 p99 10224
> > vogelweide:~/:[0]#
> >
>
> Is this the influence of wake_affine using instantaneous load now too or
> did you set SD_BALANCE_WAKE on sd's or both?

It's likely just the fork bits; I didn't turn on SD_BALANCE_WAKE.
>
> > Using instantaneous load, we fill the box every time; without it, we stack
> > every time. This was with Peter's select_idle_sibling() rewrite
> > applied as well, but you can see that it does matter.
> >
> > That doesn't mean I think my patch should immediately fly upstream
> > 'course, who knows, there may be a less messy way to deal with it, or,
> > as already stated, maybe it just doesn't matter enough to the real
> > world to even bother with.
>
> IMHO, if it would be possible to get rid of sd->wake_idx,
> sd->forkexec_idx, the implementation would be less messy. Is there
> anyone changing these values to something other than the default 0?

Dunno.

Doesn't matter much until we answer the question: are the numbers we're
using good enough, or are they not? Hackbench and schbench say we can
certainly distribute load better by looking at the real deal instead of
a ball of fuzz (a scheduler dust monster;), but how long have we been
doing that, and how many real world complaints do we have?
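
To make the "real deal vs. ball of fuzz" distinction concrete, here is a
toy Python sketch. It is not kernel code: it assumes a 0/1 runnable
signal per period and a geometric decay with a half-life of 32 periods,
and the function names and the two CPU histories are hypothetical, made
up purely for illustration. It shows one way a decayed average can lag
behind what a CPU is doing right now when run/sleep periods are long.

```python
# Toy model only; every name and number here is an illustrative assumption.
HALF_LIFE_PERIODS = 32
DECAY = 0.5 ** (1.0 / HALF_LIFE_PERIODS)  # contribution halves every 32 periods


def decayed_average(samples, decay=DECAY):
    """Geometrically decayed average of a per-period 0/1 runnable signal."""
    avg = 0.0
    for s in samples:
        avg = avg * decay + s * (1.0 - decay)
    return avg


def instantaneous(samples):
    """Load as seen right now: just the most recent sample."""
    return float(samples[-1])


def pick_least_loaded(cpus, metric):
    """Return the name of the CPU with the smallest load under `metric`."""
    return min(cpus, key=lambda name: metric(cpus[name]))


# Hypothetical histories with long run/sleep periods:
#   A slept for a long time and has just become busy.
#   B ran for a long time and has just gone idle.
cpus = {
    "A": [0] * 200 + [1] * 2,
    "B": [1] * 200 + [0] * 2,
}

by_average = pick_least_loaded(cpus, decayed_average)
by_instant = pick_least_loaded(cpus, instantaneous)

# The average still remembers history, so it prefers the CPU that is
# busy right now; the instantaneous view prefers the one that is idle.
print("decayed average picks:", by_average)
print("instantaneous picks:  ", by_instant)
```

In this toy, placing a new task by the decayed average stacks it on the
busy CPU, while placing by instantaneous load fills the idle one. Whether
that matches what any given kernel does depends on which load signal the
wakeup and fork paths actually consult.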

The schbench thing is based on a real world load, but the real world
complaint isn't the fork distribution thing that schbench demonstrates.
That's a periodic load corner, not the "we're waking to busy CPUs while
there are idle CPUs available" problem that Facebook is griping about.
So we have zero real world complaints, we have hackbench moving because
the ball of fuzz got reshaped, and we have the bumpy spot that schbench
hits with or without the bugfix that caused hackbench to twitch.

-Mike