Re: [intel-pstate driver regression] processor frequency very high even if in idle

From: Rafael J. Wysocki
Date: Thu Mar 31 2016 - 07:39:56 EST


On Thursday, March 31, 2016 11:05:56 AM JÃrg Otte wrote:

[cut]

> >
>
> Yes, works for me.
>
> CPUID(7): No-SGX
> CPU Avg_MHz Busy% Bzy_MHz TSC_MHz
> - 11 0.66 1682 2494
> 0 11 0.60 1856 2494
> 1 6 0.34 1898 2494
> 2 13 0.82 1628 2494
> 3 13 0.87 1528 2494
> CPU Avg_MHz Busy% Bzy_MHz TSC_MHz
> - 6 0.58 963 2494
> 0 8 0.83 957 2494
> 1 1 0.08 984 2494
> 2 10 1.04 975 2494
> 3 3 0.35 934 2494
>

Great, thanks!

To me, the only area where things are really different before and after the
revert is the initialization, so that likely is when the problem triggers.

And sure enough, there is an initialization problem in intel_pstate.

Please test the patch below instead of the revert and let me know if it
makes any difference.

It (or equivalent) will need to be applied anyway, so we'll work on top of it
going forward, but also it may just be sufficient to address the problem you're
seeing.

---
From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
Subject: [PATCH] intel_pstate: Do not set utilization update hook too early

The utilization update hook in the intel_pstate driver is set too
early, as it only should be set after the policy has been fully
initialized by the core. That may cause intel_pstate_update_util()
to use incorrect data and put the CPUs into incorrect P-states as
a result.

To prevent that from happening, make intel_pstate_set_policy() set
the utilization update hook instead of intel_pstate_init_cpu() so
intel_pstate_update_util() only runs when all things have been
initialized as appropriate.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
---
drivers/cpufreq/intel_pstate.c | 27 +++++++++++++++++++--------
1 file changed, 19 insertions(+), 8 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -1103,7 +1103,6 @@ static int intel_pstate_init_cpu(unsigne
intel_pstate_sample(cpu, 0);

cpu->update_util.func = intel_pstate_update_util;
- cpufreq_set_update_util_data(cpunum, &cpu->update_util);

pr_debug("intel_pstate: controlling: cpu %d\n", cpunum);

@@ -1122,18 +1121,29 @@ static unsigned int intel_pstate_get(uns
return get_avg_frequency(cpu);
}

+static void intel_pstate_set_update_util_hook(unsigned int cpu)
+{
+ cpufreq_set_update_util_data(cpu, &all_cpu_data[cpu]->update_util);
+}
+
+static void intel_pstate_clear_update_util_hook(unsigned int cpu)
+{
+ cpufreq_set_update_util_data(cpu, NULL);
+ synchronize_sched();
+}
+
static int intel_pstate_set_policy(struct cpufreq_policy *policy)
{
if (!policy->cpuinfo.max_freq)
return -ENODEV;

+ intel_pstate_clear_update_util_hook(policy->cpu);
+
if (policy->policy == CPUFREQ_POLICY_PERFORMANCE &&
policy->max >= policy->cpuinfo.max_freq) {
pr_debug("intel_pstate: set performance\n");
limits = &performance_limits;
- if (hwp_active)
- intel_pstate_hwp_set(policy->cpus);
- return 0;
+ goto out;
}

pr_debug("intel_pstate: set powersave\n");
@@ -1163,6 +1173,9 @@ static int intel_pstate_set_policy(struc
limits->max_perf = div_fp(int_tofp(limits->max_perf_pct),
int_tofp(100));

+ out:
+ intel_pstate_set_update_util_hook(policy->cpu);
+
if (hwp_active)
intel_pstate_hwp_set(policy->cpus);

@@ -1187,8 +1200,7 @@ static void intel_pstate_stop_cpu(struct

pr_debug("intel_pstate: CPU %d exiting\n", cpu_num);

- cpufreq_set_update_util_data(cpu_num, NULL);
- synchronize_sched();
+ intel_pstate_set_update_util_hook(cpu_num);

if (hwp_active)
return;
@@ -1455,8 +1467,7 @@ out:
get_online_cpus();
for_each_online_cpu(cpu) {
if (all_cpu_data[cpu]) {
- cpufreq_set_update_util_data(cpu, NULL);
- synchronize_sched();
+ intel_pstate_clear_update_util_hook(cpu);
kfree(all_cpu_data[cpu]);
}
}