Re: [PATCH] sched: Skip useless sched_balance_running acquisition if load balance is not due

From: Peter Zijlstra
Date: Fri Apr 18 2025 - 05:28:49 EST


On Fri, Apr 18, 2025 at 10:56:04AM +0530, K Prateek Nayak wrote:
> Hello Peter,
>
> On 4/17/2025 5:31 PM, Peter Zijlstra wrote:
> > > o Since this is a single flag across the entire system, it also implies
> > > CPUs cannot concurrently do load balancing across different NUMA
> > > domains which seems reasonable since a load balance at lower NUMA
> > > domain can potentially change the "nr_numa_running" and
> > > "nr_preferred_running" stats for the higher domain but if this is the
> > > case, a newidle balance at lower NUMA domain can interfere with a
> > > concurrent busy / newidle load balancing at higher NUMA domain.
> > > Is this expected? Should newidle balance be serialized too?
> >
> > Serializing new-idle might create too much idle time.
>
> In the context of busy and idle balancing, what are your thoughts on a
> per sd "serialize" flag?

My sekret hope is that this push stuff can rid us of all the idle balance
bits. But yeah, early days on that.

Other than that, I don't quite see why we should split that. Busy
balancing is the one that runs more often and is the one that should be
serialized to avoid too much cross-node traffic and all that, no?

The idle thing is less often, why not limit that?
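
To make the distinction concrete, here is a rough userspace sketch of the
idea in the subject line: only busy balancing takes the single system-wide
flag, and it checks whether a balance is due before touching the flag at
all. This is not the kernel code; `balance_running`, `struct domain`,
`enum balance_kind` and `try_balance()` are hypothetical names made up for
illustration, and C11 atomics stand in for the kernel's atomic helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the single system-wide serialization flag. */
static atomic_int balance_running;

enum balance_kind { BALANCE_BUSY, BALANCE_NEWIDLE };

/* Hypothetical, heavily simplified domain descriptor. */
struct domain {
	bool serialize;             /* domain wants serialized busy balancing */
	unsigned long last_balance; /* time of the last busy balance */
	unsigned long interval;     /* minimum gap between busy balances */
};

/* Returns true if a balance pass ran, false if it was skipped. */
static bool try_balance(struct domain *sd, enum balance_kind kind,
			unsigned long now)
{
	bool locked = false;

	/* Check "is it due?" first, so the shared flag is never bounced
	 * between CPUs when there is nothing to do anyway. */
	if (kind == BALANCE_BUSY && now - sd->last_balance < sd->interval)
		return false;

	/* Only busy balancing is serialized; new-idle runs unserialized. */
	if (kind == BALANCE_BUSY && sd->serialize) {
		int expected = 0;

		if (!atomic_compare_exchange_strong_explicit(
				&balance_running, &expected, 1,
				memory_order_acquire, memory_order_relaxed))
			return false; /* someone else is balancing */
		locked = true;
	}

	/* ... the actual balancing work would go here ... */
	if (kind == BALANCE_BUSY)
		sd->last_balance = now;

	if (locked)
		atomic_store_explicit(&balance_running, 0,
				      memory_order_release);
	return true;
}
```

With this shape, a busy balance that is not yet due returns without a
single atomic operation on the shared flag, and a new-idle balance never
contends on it. Whether new-idle should be limited as well is exactly the
open question above; the sketch takes no position on that.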