Re: [PATCH v5 10/14] sched/cpufreq: Refactor the utilization aggregation method

From: Quentin Perret
Date: Thu Aug 02 2018 - 12:04:55 EST


On Thursday 02 Aug 2018 at 15:04:40 (+0200), Peter Zijlstra wrote:
> On Wed, Aug 01, 2018 at 10:23:27AM +0100, Quentin Perret wrote:
> > On Wednesday 01 Aug 2018 at 10:35:32 (+0200), Rafael J. Wysocki wrote:
> > > On Wed, Aug 1, 2018 at 10:23 AM, Quentin Perret <quentin.perret@xxxxxxx> wrote:
> > > > On Wednesday 01 Aug 2018 at 09:32:49 (+0200), Rafael J. Wysocki wrote:
> > > >> On Tue, Jul 31, 2018 at 9:31 PM, <skannan@xxxxxxxxxxxxxx> wrote:
> > > >> >> On Monday 30 Jul 2018 at 12:35:27 (-0700), skannan@xxxxxxxxxxxxxx wrote:
> > > >> >>> If it's going to be a different aggregation from what's done for
> > > >> >>> frequency guidance, I don't see the point of having this inside
> > > >> >>> schedutil. Why not keep it inside the scheduler files?
> > > >> >>
> > > >> >> This code basically results from a discussion we had with Peter on v4.
> > > >> >> Keeping everything centralized can make sense from a maintenance
> > > >> >> perspective, I think. That makes it easy to see the impact of any change
> > > >> >> to utilization signals for both EAS and schedutil.
> > > >> >
> > > >> > In that case, I'd argue it makes more sense to keep the code centralized in
> > > >> > the scheduler. The scheduler can let schedutil know about the utilization
> > > >> > after it aggregates them. There's no need for a cpufreq governor to know
> > > >> > that there are scheduling classes or how many there are. And the scheduler
> > > >> > can then choose to aggregate one way for task packing and another way for
> > > >> > frequency guidance.
> > > >>
> > > >> Also the aggregate utilization may be used by cpuidle governors in
> > > >> principle to decide how deep they can go with idle state selection.
> > > >
> > > > The only issue I see with this right now is that some of the things done
> > > > in this function are policy decisions which really belong to the governor,
> > > > I think.
> > >
> > > Well, the scheduler makes policy decisions too, in quite a few places. :-)
> >
> > That is true ... ;-) But not so much about frequency selection yet I guess
>
> Well, sugov is part of the scheduler :-) It being so allows for the
> co-ordinated decision making required for EAS.
>
> > > The really important consideration here is whether or not there may be
> > > multiple governors making different policy decisions in that respect.
> > > If not, then where exactly the single policy decision is made doesn't
> > > particularly matter IMO.
> >
> > I think some users of the aggregated utilization signal do want to make
> > slightly different decisions (I'm thinking about the RT-go-to-max thing
> > again which makes perfect sense in sugov, but could possibly hurt EAS).
> >
> > So the "hard" part of this work is to figure out what really is a
> > governor-specific policy decision, and what is common between all users.
> > I put "hard" between quotes because I only see the case of RT as truly
> > sugov-specific for now.
> >
> > If we also want a special case for DL, Peter's enum should work OK, and
> > would allow adding more special cases for new users (cpuidle?) if needed.
> > But maybe that is something for later?
>
> Right, I don't mind moving the function. What I do oppose is having two
> very similar functions in different translation units -- because then
> they _will_ diverge and result in 'funny' things.

Sounds good :-) Would kernel/sched/pelt.c be the right place then? It's
cross-class and kinda PELT-related, I guess.
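
To make the "one function, per-user special cases" idea concrete, here is a
rough sketch of what a single shared helper taking an enum could look like.
All names below (util_type, rq_util, aggregate_cpu_util) are made up for
illustration and are not taken from the patch; the per-class signals are
assumed to be plain numbers on the same scale as the CPU capacity, and the
only special case modelled is the RT-go-to-max one mentioned above.

```c
/* Hypothetical sketch: none of these names come from the actual patch. */
enum util_type {
	FREQUENCY_UTIL,	/* caller wants a value for frequency selection */
	ENERGY_UTIL,	/* caller wants a value for energy estimation */
};

/* Assumed per-CPU snapshot of the per-class utilization signals. */
struct rq_util {
	unsigned long cfs;
	unsigned long rt;
	unsigned long dl;
	int rt_runnable;	/* nonzero if RT tasks are currently runnable */
};

/*
 * Single aggregation point shared by all users. The enum selects the
 * caller-specific policy, so there is no second, slowly diverging copy
 * of the function in another translation unit.
 */
static unsigned long aggregate_cpu_util(const struct rq_util *u,
					unsigned long max,
					enum util_type type)
{
	unsigned long util;

	/* Frequency-selection-only policy: RT tasks request max capacity. */
	if (type == FREQUENCY_UTIL && u->rt_runnable)
		return max;

	/* Common part: sum the per-class signals and clamp to capacity. */
	util = u->cfs + u->rt + u->dl;

	return util < max ? util : max;
}
```

With something of that shape, the frequency-selection path would pass
FREQUENCY_UTIL and the energy-estimation path ENERGY_UTIL, and a new user
(cpuidle?) would only need a new enum value if it really wants a different
policy.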

Thanks,
Quentin