Re: [PATCH] sched/fair: Introduce fits_capacity()

From: Quentin Perret
Date: Tue Jun 18 2019 - 05:07:46 EST


On Tuesday 18 Jun 2019 at 10:36:56 (+0200), Rafael J. Wysocki wrote:
> On Tue, Jun 18, 2019 at 10:25 AM Quentin Perret <quentin.perret@xxxxxxx> wrote:
> >
> > On Tuesday 18 Jun 2019 at 10:10:48 (+0200), Rafael J. Wysocki wrote:
> > > On Tue, Jun 18, 2019 at 9:47 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> > > >
> > > > On 18-06-19, 09:26, Rafael J. Wysocki wrote:
> > > > > On Tue, Jun 18, 2019 at 5:12 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > +Rafael
> > > > > >
> > > > > > On 17-06-19, 17:02, Peter Zijlstra wrote:
> > > > > > > On Thu, Jun 06, 2019 at 08:22:04AM +0530, Viresh Kumar wrote:
> > > > > > > > Hmm, even if the values are same currently I am not sure if we want
> > > > > > > > the same for ever. I will write a patch for it though, if Peter/Rafael
> > > > > > > > feel the same as you.
> > > > > > >
> > > > > > > Is it really the same variable or just two numbers that happen to be the
> > > > > > > same?
> > > > > >
> > > > > > In both cases we are trying to keep the load under 80% of what can be supported.
> > > > > > But I am not sure of the answer to your question.
> > > > > >
> > > > > > Maybe Rafael knows :)
> > > > >
> > > > > Which variable?
> > > >
> > > > Schedutil multiplies the target frequency by 1.25 (20% more capacity eventually)
> > > > to get enough room for more load and a similar thing is done in fair.c at several
> > > > places to see if the new task can fit in a runqueue without overloading it.
> > >
> > > For the schedutil part, see the changelog of the commit that introduced it:
> > >
> > > 9bdcb44e391d cpufreq: schedutil: New governor based on scheduler
> > > utilization data
> > >
> > > As for the other places, I don't know about the exact reasoning.
> > >
> > > > Quentin suggested to use common code for this calculation and that is what is
> > > > getting discussed here.
> > >
> > > I guess if the rationale for the formula is the same in all cases, it
> > > would be good to consolidate that code and document the rationale
> > > while at it.
> >
> > I _think_ it is, but I guess others could correct me if this is
> > incorrect.
> >
> > When choosing a CPU or a frequency using a util value, we look for a
> > capacity that will provide us with 20% of idle time. And in both cases we
> > use the same threshold, just hardcoded in different places. Hence the
> > suggestion to unify things.
> >
> > I hope that makes sense :-)
>
> Well, for schedutil, the 1.25 value comes from the case where
> utilization is not frequency-invariant and the next-frequency formula
> is recursive (the next frequency is proportional to the current one). It
> is chosen to get the new frequency equal to the old one if (util /
> max) is .8. That translates to the "capacity that will provide 20%
> more of idle time" in the frequency-invariant utilization case, but
> the original rationale was different.

OK, thanks, I wasn't aware of this. I understood it the other way
around, but re-reading the commit message you shared earlier, this makes
sense.
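
To make the fixed-point argument concrete, here is a rough sketch of the
arithmetic. It is not the schedutil code: scale_freq() and the sample
numbers are made up for illustration, and the only assumption is that the
next frequency is computed as 1.25 * base_freq * util / max, where
base_freq is the current frequency when utilization is not
frequency-invariant and the maximum frequency when it is.

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only:
 * returns 1.25 * base_freq * util / max using integer arithmetic.
 */
static inline unsigned long long scale_freq(unsigned long long base_freq,
                                            unsigned long long util,
                                            unsigned long long max)
{
	assert(max > 0);	/* avoid dividing by zero */

	/* base_freq + base_freq / 4 == 1.25 * base_freq */
	return (base_freq + (base_freq >> 2)) * util / max;
}
```

With base_freq = 1000000 (the current frequency, non-invariant case) and
max = 1000, util = 800 gives back 1000000, so the frequency stays put;
util = 900 gives 1125000 and util = 500 gives 625000. With base_freq =
2000000 (the maximum frequency, invariant case), util = 400 gives
1000000, i.e. a frequency at which util is 80% of the available capacity.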

I guess it is also worth mentioning that, to the best of my knowledge,
the vast majority of real-world sugov users are in fact frequency
invariant. So perhaps there is still a case for the code factorization
suggested earlier. But in the end it is really just a cleanup to help
maintainability, so if you guys don't buy in there is no point pushing
further :-)
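
For reference, the kind of factorization meant here could look roughly
like the sketch below. This is an assumption-laden illustration, not a
posted patch: HEADROOM_NUM, HEADROOM_DEN, util_fits() and freq_for_util()
are hypothetical names, and 1280/1024 is simply one way to write the 1.25
factor with integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared margin: 1280 / 1024 == 1.25 */
#define HEADROOM_NUM	1280ULL
#define HEADROOM_DEN	1024ULL

/* CPU selection side: does util fit in cap with 20% of idle time left? */
static inline bool util_fits(unsigned long long util, unsigned long long cap)
{
	return util * HEADROOM_NUM < cap * HEADROOM_DEN;
}

/* Frequency selection side: same margin applied to the frequency request. */
static inline unsigned long long freq_for_util(unsigned long long max_freq,
                                               unsigned long long util,
                                               unsigned long long cap)
{
	assert(cap > 0);	/* avoid dividing by zero */

	return max_freq * HEADROOM_NUM * util / (HEADROOM_DEN * cap);
}
```

Both helpers read the margin from one place, so changing it later would
affect CPU selection and frequency selection together. Whether that
coupling is desirable is exactly the open question in this thread.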

Thanks,
Quentin