Re: cpufreq: intel_pstate: map utilization into the pstate range

From: Julia Lawall
Date: Fri Dec 24 2021 - 06:08:58 EST




On Wed, 22 Dec 2021, Rafael J. Wysocki wrote:

> On Wed, Dec 22, 2021 at 12:57 AM Francisco Jerez <currojerez@xxxxxxxxxx> wrote:
> >
> > "Rafael J. Wysocki" <rafael@xxxxxxxxxx> writes:
> >
> > > On Sun, Dec 19, 2021 at 11:10 PM Francisco Jerez <currojerez@xxxxxxxxxx> wrote:
> > >>
> > >> Julia Lawall <julia.lawall@xxxxxxxx> writes:
> > >>
> > >> > On Sat, 18 Dec 2021, Francisco Jerez wrote:
> > >
> > > [cut]
> > >
> > >> > I did some experiments with forcing different frequencies. I haven't
> > >> > finished processing the results, but I notice that as the frequency goes
> > >> > up, the utilization (specifically the value of
> > >> > map_util_perf(sg_cpu->util) at the point of the call to
> > >> > cpufreq_driver_adjust_perf in sugov_update_single_perf) goes up as well.
> > >> > Is this expected?
> > >> >
> > >>
> > >> Actually, it *is* expected based on our previous hypothesis that these
> > >> workloads are largely latency-bound: In cases where a given burst of CPU
> > >> work is not parallelizable with any other tasks the thread needs to
> > >> complete subsequently, its overall runtime will decrease monotonically
> > >> with increasing frequency, therefore the number of instructions executed
> > >> per unit of time will increase monotonically with increasing frequency,
> > >> and with it its frequency-invariant utilization.
> > >
> > > But shouldn't these two effects cancel each other if the
> > > frequency-invariance mechanism works well?
> >
> > No, they won't cancel each other out under our hypothesis that these
> > workloads are largely latency-bound, since the performance of the
> > application will increase steadily with increasing frequency, and with
> > it the amount of computational resources it utilizes per unit of time on
> > the average, and therefore its frequency-invariant utilization as well.
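
To make the quoted argument concrete, here is a toy model (my own
simplification, not how the scheduler actually tracks utilization, which
uses decaying averages). It assumes a task that alternates a fixed number
of CPU cycles with either a fixed period (rate-bound) or a fixed wait that
does not shrink with CPU frequency (latency-bound). All function and
parameter names are made up for illustration.

```python
def invariant_util_fixed_rate(freq_hz, max_freq_hz, work_cycles, period_s):
    """Task that runs `work_cycles` once every `period_s`, whatever the frequency.

    Only meaningful while the work fits in the period (busy time <= period_s).
    """
    busy_s = work_cycles / freq_hz                 # run time shrinks as 1/f
    raw_util = busy_s / period_s                   # fraction of wall time spent running
    return raw_util * (freq_hz / max_freq_hz)      # frequency-invariance scaling


def invariant_util_latency_bound(freq_hz, max_freq_hz, work_cycles, wait_s):
    """Task that runs `work_cycles`, then waits `wait_s` before it can run again."""
    busy_s = work_cycles / freq_hz                 # run time shrinks as 1/f
    period_s = busy_s + wait_s                     # so the whole iteration gets shorter
    raw_util = busy_s / period_s
    return raw_util * (freq_hz / max_freq_hz)
```

With 3e6 cycles of work and f_max = 3 GHz, the fixed-rate value with a
5 ms period stays at 0.2 at 1, 2 and 3 GHz (the two effects cancel),
while the latency-bound value with a 2 ms wait rises from 0.2 at 1 GHz
to about 0.33 at 3 GHz.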
>
> OK, so this is a workload in which the maximum performance is only
> achieved at the maximum available frequency. IOW, there's no
> performance saturation point and increasing the frequency (if
> possible) will always cause more work to be done per unit of time.
>
> For this type of workload, requirements regarding performance (for
> example, upper bound on the expected time of computations) need to be
> known in order to determine the "most suitable" frequency to run them
> and I agree that schedutil doesn't help much in that respect.
>
> It is probably better to run them with intel_pstate in the active mode
> (i.e. "pure HWP") or decrease EPP via sysfs to allow HWP to ramp up
> turbo more aggressively.

Active mode with the powersave governor indeed gives both faster runtimes
and lower energy consumption for these examples.
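
For anyone wanting to try the same configuration, a minimal sketch of
the switch is below. It is not the script used for the measurements
above. It assumes the usual sysfs layout (intel_pstate/status and
cpuN/cpufreq/scaling_governor under /sys/devices/system/cpu) and needs
root; the function name and the sysfs_root parameter are hypothetical
and exist only so the logic can be tried against a scratch directory.

```python
from pathlib import Path


def set_active_powersave(sysfs_root="/sys/devices/system/cpu"):
    """Put intel_pstate in active mode and select powersave on every CPU.

    Returns the list of scaling_governor files that were written.
    """
    root = Path(sysfs_root)

    # Switch the driver mode first; the governors on offer depend on it.
    (root / "intel_pstate" / "status").write_text("active\n")

    changed = []
    # cpu[0-9]* matches cpu0, cpu1, ... but skips cpufreq/ and cpuidle/.
    for governor in sorted(root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        governor.write_text("powersave\n")
        changed.append(governor)
    return changed
```

Calling set_active_powersave() with no argument targets the real sysfs
tree; passing a temporary directory with the same layout lets you check
what would be written without touching the system.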

thanks,
julia