Re: [RFC/RFT] [PATCH 02/10] cpufreq: intel_pstate: Conditional frequency invariant accounting

From: Juri Lelli
Date: Wed May 16 2018 - 10:46:53 EST


On 15/05/18 21:49, Srinivas Pandruvada wrote:
> intel_pstate has two operating modes: active and passive. In "active"
> mode, the built-in scaling governor is used, and in "passive" mode
> the driver can be used with any governor, such as "schedutil". In
> "active" mode the utilization values from schedutil are not used, and
> high performance computing use cases require that no APERF/MPERF MSRs
> be read. In this case there is no need to spend CPU cycles on
> frequency invariant accounting by reading APERF/MPERF MSRs.
> With this change, frequency invariant accounting is only enabled in
> "passive" mode.
>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>
> ---
> [Note: The tick will be enabled later in the series when hwp dynamic
> boost is enabled]
>
> drivers/cpufreq/intel_pstate.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 17e566af..f686bbe 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -2040,6 +2040,8 @@ static int intel_pstate_register_driver(struct cpufreq_driver *driver)
>  {
>  	int ret;
>
> +	x86_arch_scale_freq_tick_disable();
> +
>  	memset(&global, 0, sizeof(global));
>  	global.max_perf_pct = 100;
>
> @@ -2052,6 +2054,9 @@ static int intel_pstate_register_driver(struct cpufreq_driver *driver)
>
>  	global.min_perf_pct = min_perf_pct_min();
>
> +	if (driver == &intel_cpufreq)
> +		x86_arch_scale_freq_tick_enable();

In passive mode this will unconditionally trigger the APERF/MPERF
reading/calculation at each tick, even when the information is not
actually consumed (e.g., when running performance or any governor
other than schedutil), right? Do we want that?
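
To make the question concrete, here is a rough user-space sketch of
gating the accounting on the governor as well as on the driver mode. It
is not kernel code: every model_* name and the helper
governor_uses_freq_invariance() are hypothetical, and it assumes, purely
for illustration, that schedutil is the only consumer of the
frequency-invariant data.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the tick-accounting switch; not a kernel symbol. */
static bool scale_freq_tick_enabled;

static void model_scale_freq_tick_enable(void)
{
	scale_freq_tick_enabled = true;
}

static void model_scale_freq_tick_disable(void)
{
	scale_freq_tick_enabled = false;
}

/*
 * Assumption for illustration only: schedutil is treated as the sole
 * consumer of frequency-invariant utilization.
 */
static bool governor_uses_freq_invariance(const char *gov)
{
	return strcmp(gov, "schedutil") == 0;
}

/*
 * Hypothetical hook run when the driver mode or the governor changes:
 * pay for the per-tick APERF/MPERF reads only when something uses them.
 */
static void model_governor_changed(bool passive_mode, const char *gov)
{
	assert(gov != NULL);

	if (passive_mode && governor_uses_freq_invariance(gov))
		model_scale_freq_tick_enable();
	else
		model_scale_freq_tick_disable();
}
```

With something along these lines, switching to performance while in
passive mode would turn the per-tick reads off again, instead of leaving
them on for as long as the intel_cpufreq driver is registered.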

Anyway, FWIW I started testing this on an E5-2609 v3 and I'm not seeing
hackbench regressions so far (running with the schedutil governor).

Best,

- Juri