Re: [PATCH v3 06/14] sched/cpufreq: uclamp: add utilization clamping for RT tasks

From: Vincent Guittot
Date: Mon Aug 13 2018 - 08:07:23 EST


On Mon, 13 Aug 2018 at 12:12, Patrick Bellasi <patrick.bellasi@xxxxxxx> wrote:
>
> Hi Vincent!
>
> On 09-Aug 18:03, Vincent Guittot wrote:
> > > On 07-Aug 15:26, Juri Lelli wrote:
>
> [...]
>
> > > > > + util_cfs = cpu_util_cfs(rq);
> > > > > + util_rt = cpu_util_rt(rq);
> > > > > + if (sched_feat(UCLAMP_SCHED_CLASS)) {
> > > > > + util = 0;
> > > > > + if (util_cfs)
> > > > > + util += uclamp_util(cpu_of(rq), util_cfs);
> > > > > + if (util_rt)
> > > > > + util += uclamp_util(cpu_of(rq), util_rt);
> > > > > + } else {
> > > > > + util = cpu_util_cfs(rq);
> > > > > + util += cpu_util_rt(rq);
> > > > > + util = uclamp_util(cpu_of(rq), util);
> > > > > + }
> > >
> > > Regarding the two policies, do you have any comment?
> >
> > Does the policy for (sched_feat(UCLAMP_SCHED_CLASS)== true) really
> > make sense as it is ?
> > I mean, uclamp_util doesn't make any difference between rt and cfs
> > tasks when clamping the utilization so why should be add twice the
> > returned value ?
> > IMHO, this policy would make sense if there were something like
> > uclamp_util_rt() and a uclamp_util_cfs()
>
> The idea for the UCLAMP_SCHED_CLASS policy is to improve fairness on
> low-priority classese, especially when we have high RT utilization.
>
> Let say we have:
>
> util_rt = 40%, util_min=0%
> util_cfs = 10%, util_min=50%
>
> the two policies will select:
>
> UCLAMP_SCHED_CLASS: util = uclamp(40) + uclamp(10) = 50 + 50 = 100%
> !UCLAMP_SCHED_CLASS: util = uclamp(40 + 10) = uclmp(50) = 50%
>
> Which means that, despite the CPU's util_min will be set to 50% when
> CFS is running, these tasks will have almost no boost at all, since
> their bandwidth margin is eclipsed by RT tasks.

Hmm ... At the opposite, even if there is no running rt task but only
some remaining blocked rt utilization,
even if util_rt = 10%, util_min=0%
and util_cfs = 40%, util_min=50%
the UCLAMP_SCHED_CLASS: util = uclamp(10) + uclamp(40) = 50 + 50 = 100%

So cfs task can get double boosted by a small rt task.

Furthermore, if there is no rt task but 2 cfs tasks of 40% and 10%
the UCLAMP_SCHED_CLASS: util = uclamp(0) + uclamp(40) = 50 = 50%

So in this case cfs tasks don't get more boost and have to share the
bandwidth and you don't ensure 50% for each unlike what you try to do
for rt.
You create a difference in the behavior depending of the class of the
others co-schedule tasks which is not sane IMHO


>
> > > We had an internal discussion and we found pro/cons for both... but
>
> The UCLAMP_SCHED_CLASS policy is thus less energy efficiency but it
> should grant a better "isolation" in terms of what is the expected
> speed-up a task will get at run-time, independently from higher
> priority classes.
>
> Does that make sense?
>
> > > I'm not sure keeping the sched_feat is a good solution on the long
> > > run, i.e. mainline merge ;)
>
> This problem still stands...
>
> --
> #include <best/regards.h>
>
> Patrick Bellasi