Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

From: Mike Galbraith
Date: Mon Feb 05 2018 - 08:55:25 EST


On Mon, 2018-02-05 at 13:48 +0100, Peter Zijlstra wrote:
> On Fri, Feb 02, 2018 at 04:06:32PM -0500, Steven Sistare wrote:
> > On 2/2/2018 2:59 PM, Peter Zijlstra wrote:
>
> > > But then you get that atomic crud to contend on the cluster level, which
> > > is even worse than it contending on the core level.
> >
> > True, but it can still be a net win if we make better scheduling decisions.
> > A saving grace is that the atomic counter is only updated if the cpu
> > makes a transition from idle to busy or vice versa.
>
> Which can still be a very high rate for some workloads. I always forget
> which, but there are plenty workloads that have very frequent very
> short idle times. Mike, do you remember what comes apart when we take
> out the sysctl_sched_migration_cost test in idle_balance()?

Used to be that anything scheduling cross-core suffered heftily, i.e. pretty
much any localhost communication heavy load.  I just tried disabling it
in 4.13 though (pre-PTI cliff), tried tbench, and it made zip squat
difference.  I presume that's due to the this_rq->rd->overload and/or
curr_cost checks added in the meantime.  I don't recall the original cost
details beyond it having been "a sh*tload".

-Mike