Re: [RFC PATCH v4 0/6] sched/cpufreq: Make schedutil energy aware

From: Peter Zijlstra
Date: Mon Feb 10 2020 - 08:31:34 EST


On Thu, Jan 23, 2020 at 05:16:52PM +0000, Douglas Raillard wrote:
> Hi Rafael,
>
> On 1/23/20 3:43 PM, Rafael J. Wysocki wrote:
> > On Wed, Jan 22, 2020 at 6:36 PM Douglas RAILLARD
> > <douglas.raillard@xxxxxxx> wrote:
> >>
> >> Make schedutil cpufreq governor energy-aware.
> >
> > I have to say that your terminology is confusing to me, like what
> > exactly does "energy-aware" mean in the first place?
>
> Should be better rephrased as "Make schedutil cpufreq governor use the
> energy model" I guess. Schedutil is indeed already energy aware since it
> tries to use the lowest frequency possible for the job to be done (kind of).

So ARM64 will soon get x86-like power management if I read these here
patches right:

https://lkml.kernel.org/r/20191218182607.21607-2-ionela.voinescu@xxxxxxx

And I'm thinking a part of Rafael's concerns will also apply to those
platforms.

> Other than that, the only energy-related information schedutil uses is
> the assumption that lower freq == better efficiency. Explicit use of the
> EM allows to refine this assumption.

I'm thinking that such platforms guarantee this on their own; if not,
there just isn't anything we can do about it, so that assumption is
fair.

(I've always found it weird to have less efficient OPPs listed anyway)
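
For illustration only, a minimal userspace sketch of what "less
efficient OPP" means here. It assumes a table sorted by ascending
frequency and compares power/frequency ratios. The struct and function
names are hypothetical and are not the kernel's EM API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical OPP entry; NOT the kernel's struct em_perf_state. */
struct opp {
	unsigned long freq_khz;
	unsigned long power_mw;
};

/*
 * Return 1 if some higher-frequency OPP costs no more energy per unit
 * of work (power / freq) than table[i], i.e. table[i] is dominated and
 * could be skipped under normal circumstances. Return 0 otherwise.
 *
 * Assumes 'table' is sorted by ascending frequency.
 */
int opp_is_inefficient(const struct opp *table, size_t nr, size_t i)
{
	size_t j;

	assert(table && i < nr);

	for (j = i + 1; j < nr; j++) {
		/* power[j]/freq[j] <= power[i]/freq[i], cross-multiplied */
		unsigned long long lhs =
			(unsigned long long)table[j].power_mw * table[i].freq_khz;
		unsigned long long rhs =
			(unsigned long long)table[i].power_mw * table[j].freq_khz;

		if (lhs <= rhs)
			return 1;
	}

	return 0;
}
```

With a made-up table of { 500 MHz, 100 mW }, { 1000 MHz, 150 mW },
{ 1500 MHz, 400 mW }, only the first entry is flagged: the 1000 MHz
OPP does the same work for less energy.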

> >> 1) Selecting the highest possible frequency for a given cost. Some
> >> platforms can have lower frequencies that are less efficient than
> >> higher ones, in which case they should be skipped for most purposes.
> >> They can still be useful to give more freedom to thermal throttling
> >> mechanisms, but not under normal circumstances.
> >> note: the EM framework will warn about such OPPs "hertz/watts ratio
> >> non-monotonically decreasing"
> >
> > While all of that is fair enough for platforms using the EM, do you
> > realize that the EM is not available on the majority of architectures
> > (including some fairly significant ones) and so adding overhead
> > related to it for all of them is quite less than welcome?
>
> When CONFIG_ENERGY_MODEL is not defined, em_pd_get_higher_freq() is
> defined to a static inline no-op function, so that feature won't incur
> overhead (patch 1+2+3).
>
> Patch 4 and 5 do add some new logic that could be used on any platform.
> Current code will use the boost as an energy margin, but it would be
> straightforward to make a util-based version (like iowait boost) on
> non-EM platforms.

Right, so the condition 'util_avg > util_est' makes sense to trigger
some sort of boost off of.

What kind would make sense for these platforms? One possibility would
be, instead of frobbing the energy margin as you do here, to frob the C
in get_next_freq().

(I have vague memories of this being proposed earlier; it also avoids
that double OPP iteration thing complained about elsewhere in this
thread if I'm not mistaken).
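
A rough userspace sketch of that idea, under stated assumptions.
get_next_freq() documents next_freq = C * max_freq * util / max with
C = 1.25; the variant below makes C a parameter in 1/1024 units and
raises it when util_avg > util_est. The function names and the boosted
value of 1.5 are hypothetical, picked only for illustration:

```c
#include <assert.h>

#define C_DEFAULT	1280UL	/* 1.25 in 1/1024 units, as in schedutil */
#define C_BOOST		1536UL	/* 1.5, arbitrary value for illustration */

/* Baseline: C = 1.25, computed as freq + freq / 4. */
unsigned long next_freq_default(unsigned long max_freq,
				unsigned long util, unsigned long max)
{
	assert(max > 0);
	return (max_freq + (max_freq >> 2)) * util / max;
}

/* Hypothetical variant taking C as a parameter (1/1024 units). */
unsigned long next_freq_scaled(unsigned long max_freq, unsigned long util,
			       unsigned long max, unsigned long c_1024)
{
	unsigned long long freq;

	assert(max > 0);
	freq = (unsigned long long)max_freq * c_1024 / 1024;
	return (unsigned long)(freq * util / max);
}

/* Use the larger C while util_avg exceeds util_est. */
unsigned long pick_c(unsigned long util_avg, unsigned long util_est)
{
	return util_avg > util_est ? C_BOOST : C_DEFAULT;
}
```

Since this only changes the requested frequency, it needs no energy
model, which is why it would also apply to !EM platforms.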


That is, I'm thinking it is important (esp. now that we got frequency
invariance sorted for x86) to have this patch also work for !EM
architectures (as those ARM64-AMU things would be).