Re: [RFC PATCH 1/2] sched: Clean up SD_BALANCE_WAKE flags in sched domain build-up

From: Yuyang Du
Date: Thu Jun 02 2016 - 02:38:26 EST


On Thu, Jun 02, 2016 at 07:50:23AM +0200, Mike Galbraith wrote:
> > > Nope, those two have different meanings. We pass SD_BALANCE_WAKE to
> > > identify a ttwu() wakeup, just as we pass SD_BALANCE_FORK to say we're
> > > waking a child. SD_WAKE_AFFINE means exactly what it says, but is only
> > > applicable to ttwu() wakeups.
> >
> > I don't disagree, but I want to add that SD_WAKE_AFFINE has no meaning so
> > special or so important that anyone needs the flag to tune anything. If you want
> > to do any SD_BALANCE_*, the waker CPU is a valid candidate if allowed; that is it.
>
> That flag lets the user specifically tell us that he doesn't want us to
> bounce his tasks around the box, cache misses be damned. The user may
> _know_ that say cross node migrations hurt his load more than help, and
> not want us to do that, thus expresses himself by turning the flag off
> at whatever level. People do that. You can force them to take other
> measures, but why do that?

Agreed. With this patch, the user gets that behavior by disabling SD_BALANCE_WAKE.

> > IIUC your XXX mark and your comment "Prefer wake_affine over balance flags", you
> > said the same thing: SD_WAKE_AFFINE should be part of SD_BALANCE_WAKE, and should
> > be part of all SD_BALANCE_* flags,
>
> Peter wrote that, but I don't read it the way you do. I read it as: if the
> user wants the benefits of affine wakeups, he surely doesn't want us to
> send the wakee off to god knows where on every wakeup _instead_ of
> waking affine; he wants to balance iff he can't have an affine wakeup.

That is a separate matter that we may define further within SD_BALANCE_WAKE:
how much scanning effort, or how much bouncing, the user is willing to accept.
Today that is defined by the SD_WAKE_AFFINE flag, which I don't think is good.
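
For reference, here is a simplified userspace sketch of the domain walk
under discussion, where an affine wakeup is preferred over the balance
flags. It is a toy model, not the kernel code: the flag values, struct
toy_domain, struct pick and pick_domain() are hypothetical stand-ins, and
want_affine stands for "wake_wide() did not object and the waker CPU is
allowed".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel flag bits; values are arbitrary. */
enum {
	SD_BALANCE_FORK = 1 << 0,
	SD_BALANCE_WAKE = 1 << 1,
	SD_WAKE_AFFINE  = 1 << 2,
};

/* Toy sched domain: just its flags and whether its span covers prev_cpu. */
struct toy_domain {
	unsigned int flags;
	bool spans_prev_cpu;
};

enum pick_kind { PICK_NONE, PICK_AFFINE, PICK_BALANCE };

struct pick {
	enum pick_kind kind;
	int level;	/* index into the domain array, -1 if nothing was picked */
};

/*
 * Walk the domains from lowest (index 0) to highest.
 * An affine domain wins outright and ends the walk. Otherwise remember the
 * highest domain that has the requested balance flag set. If a level lacks
 * the flag and no affine wakeup is wanted, stop looking further up.
 */
static struct pick pick_domain(const struct toy_domain *doms, int n,
			       unsigned int sd_flag, bool want_affine)
{
	struct pick p = { PICK_NONE, -1 };

	assert(doms != NULL || n == 0);

	for (int i = 0; i < n; i++) {
		const struct toy_domain *tmp = &doms[i];

		if (want_affine && (tmp->flags & SD_WAKE_AFFINE) &&
		    tmp->spans_prev_cpu) {
			p.kind = PICK_AFFINE;
			p.level = i;
			return p;
		}

		if (tmp->flags & sd_flag) {
			p.kind = PICK_BALANCE;
			p.level = i;
		} else if (!want_affine) {
			break;
		}
	}

	return p;
}
```

In this model, a domain with only SD_WAKE_AFFINE set and want_affine false
yields PICK_NONE, i.e. the task is left where it was. Turning on
SD_BALANCE_WAKE in the same domain yields PICK_BALANCE instead, which is
the "full balance cost" case raised in the quoted text below.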

> > > If wake_wide() says we do not want an affine wakeup, but you apply
> > > SD_WAKE_AFFINE meaning to SD_BALANCE_WAKE and turn it on in ->flags,
> > > we'll give the user a free sample of full balance cost, no?
> >
> > Yes, and otherwise we don't select anything? That is bad enough, whether or not
> > it is worse. So the whole fuss I made is really that this is the right thing to start with. :)
>
> Nope, leaving tasks where they were is not a bad thing. Lots of stuff
> likes the scheduler best when it leaves them the hell alone :) That
> works out well all around, balance cycles are spent in userspace
> instead, scheduler produces wins by doing nothing, perfect.
>
Again, agreed. With this patch, just disable SD_BALANCE_WAKE to get that. :)