Re: [RFCv5 PATCH 25/46] sched: Add over-utilization/tipping point indicator

From: Morten Rasmussen
Date: Fri Oct 09 2015 - 08:49:45 EST


On Tue, Sep 29, 2015 at 01:08:46PM -0700, Steve Muckle wrote:
> On 08/14/2015 06:02 AM, Morten Rasmussen wrote:
> > To be sure not to break smp_nice, we have defined over-utilization as
> > when:
> >
> > cpu_rq(any)::cfs::avg::util_avg + margin > cpu_rq(any)::capacity
> >
> > is true for any cpu in the system. IOW, as soon as one cpu is (nearly)
> > 100% utilized, we switch to load_avg to factor in priority.
> >
> > Now with this definition, we can skip periodic load-balance as no cpu
> > has an always-running task when the system is not over-utilized. All
> > tasks will be periodic and we can balance them at wake-up. This
> > conservative condition does however mean that some scenarios that could
> > benefit from energy-aware decisions even if one cpu is fully utilized
> > would not get those benefits.
> >
> > For systems where some cpus might have reduced capacity
> > (RT-pressure and/or big.LITTLE), we want periodic load-balance checks as
> > soon as just a single cpu is fully utilized, as it might be one of those
> > with reduced capacity, and in that case we want to migrate it.
> >
> > I haven't found any reasonably easy-to-track conditions that would work
> > better. Suggestions are very welcome.
>
> Workloads with a single heavy task and many small tasks are pretty
> common. I'm worried about the single heavy task tripping the
> over-utilization condition on a b.L system, EAS getting turned off, and
> small tasks running on big CPUs, leading to an increase in power
> consumption.
>
> Perhaps an extension to the over-utilization logic such as the following
> could cause big CPUs being saturated by a single task to be ignored?
>
> util(cpu X) + margin > capacity(cpu X) &&
> (capacity(cpu X) != max_capacity ? 1 : nr_running(cpu X) > 1)

I have had the same thought as well. I think it could work. nr_running()
doesn't take into account blocked tasks, so we could in theory see fewer
tasks than there are, but those scenarios are currently ignored by
load-balancing anyway and if the cpu is seriously over-utilized we are
quite likely to have nr_running() > 1.
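
A minimal userspace sketch of the two conditions discussed above, in C, may
help when comparing them. It is not kernel code: struct cpu_snapshot, its
fields, and the function names are hypothetical, and the additive margin
simply mirrors the expressions quoted in this thread rather than whatever
form an actual patch would use.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu snapshot; names are illustrative, not kernel API. */
struct cpu_snapshot {
	unsigned long util;      /* utilization, same scale as capacity */
	unsigned long capacity;  /* capacity currently available on this cpu */
	unsigned int nr_running; /* runnable tasks; blocked tasks not counted */
};

/* Original definition: util + margin > capacity. */
static bool cpu_overutilized(const struct cpu_snapshot *c,
			     unsigned long margin)
{
	return c->util + margin > c->capacity;
}

/*
 * Proposed extension: a cpu at max_capacity that is saturated by a
 * single task is ignored, since there is no bigger cpu to move it to.
 */
static bool cpu_overutilized_ext(const struct cpu_snapshot *c,
				 unsigned long margin,
				 unsigned long max_capacity)
{
	if (!cpu_overutilized(c, margin))
		return false;
	if (c->capacity != max_capacity)
		return true;
	return c->nr_running > 1;
}

/* The system is over-utilized as soon as any cpu trips the check. */
static bool system_overutilized(const struct cpu_snapshot *cpus, size_t n,
				unsigned long margin,
				unsigned long max_capacity)
{
	for (size_t i = 0; i < n; i++) {
		if (cpu_overutilized_ext(&cpus[i], margin, max_capacity))
			return true;
	}
	return false;
}
```

With a margin of 100, a cpu at capacity 1024 with util 1000 and one runnable
task trips the original check but not the extended one, while a cpu at
capacity 430 with util 400 trips both.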

I'm in favor of giving it a try and seeing what explodes :-)