Re: [PATCH v6 4/5] cpufreq: qcom: Update the bandwidth levels on frequency change

From: Sibi Sankar
Date: Tue Jun 16 2020 - 17:05:06 EST


Hey Matthias,
Thanks for taking time to review
the series.

On 2020-06-15 22:55, Matthias Kaehlcke wrote:
Hi Sibi,

On Sat, Jun 06, 2020 at 03:03:31AM +0530, Sibi Sankar wrote:
Add support to parse optional OPP table attached to the cpu node when
the OPP bandwidth values are populated. This allows for scaling of
DDR/L3 bandwidth levels with frequency change.

Signed-off-by: Sibi Sankar <sibis@xxxxxxxxxxxxxx>
---

v6:
* Add global flag to distinguish between voltage update and opp add.
Use the same flag before trying to scale ddr/l3 bw [Viresh]
* Use dev_pm_opp_find_freq_ceil to grab all opps [Viresh]
* Move dev_pm_opp_of_find_icc_paths into probe [Viresh]

v5:
* Use dev_pm_opp_adjust_voltage instead [Viresh]
* Misc cleanup

v4:
* Split fast switch disable into another patch [Lukasz]

drivers/cpufreq/qcom-cpufreq-hw.c | 82 ++++++++++++++++++++++++++++++-
1 file changed, 80 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index fc92a8842e252..8fa6ab6e0e4b6 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -6,6 +6,7 @@
#include <linux/bitfield.h>
#include <linux/cpufreq.h>
#include <linux/init.h>
+#include <linux/interconnect.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/of_address.h>
@@ -30,6 +31,48 @@

static unsigned long cpu_hw_rate, xo_rate;
static struct platform_device *global_pdev;
+static bool icc_scaling_enabled;

It seem you rely on 'icc_scaling_enabled' to be initialized to 'false'.
This works during the first initialization, but not if the 'device' is
unbound/rebound. In theory things shouldn't be different in a succesive

yes it shouldn't but sure I'll set
it to false along the way.

initialization, however for robustness the variable should be explicitly
set to 'false' somewhere in the code path (_probe(), _read_lut(), ...).


+static int qcom_cpufreq_set_bw(struct cpufreq_policy *policy,
+ unsigned long freq_khz)
+{
+ unsigned long freq_hz = freq_khz * 1000;
+ struct dev_pm_opp *opp;
+ struct device *dev;
+ int ret;
+
+ dev = get_cpu_device(policy->cpu);
+ if (!dev)
+ return -ENODEV;
+
+ opp = dev_pm_opp_find_freq_exact(dev, freq_hz, true);
+ if (IS_ERR(opp))
+ return PTR_ERR(opp);
+
+ ret = dev_pm_opp_set_bw(dev, opp);
+ dev_pm_opp_put(opp);
+ return ret;
+}
+
+static int qcom_cpufreq_update_opp(struct device *cpu_dev,
+ unsigned long freq_khz,
+ unsigned long volt)
+{
+ unsigned long freq_hz = freq_khz * 1000;
+ int ret;
+
+ /* Skip voltage update if the opp table is not available */
+ if (!icc_scaling_enabled)
+ return dev_pm_opp_add(cpu_dev, freq_hz, volt);
+
+ ret = dev_pm_opp_adjust_voltage(cpu_dev, freq_hz, volt, volt, volt);
+ if (ret) {
+ dev_err(cpu_dev, "Voltage update failed freq=%ld\n", freq_khz);
+ return ret;
+ }
+
+ return dev_pm_opp_enable(cpu_dev, freq_hz);
+}

static int qcom_cpufreq_hw_target_index(struct cpufreq_policy *policy,
unsigned int index)
@@ -39,6 +82,9 @@ static int qcom_cpufreq_hw_target_index(struct cpufreq_policy *policy,

writel_relaxed(index, perf_state_reg);

+ if (icc_scaling_enabled)
+ qcom_cpufreq_set_bw(policy, freq);
+
arch_set_freq_scale(policy->related_cpus, freq,
policy->cpuinfo.max_freq);
return 0;
@@ -89,11 +135,31 @@ static int qcom_cpufreq_hw_read_lut(struct device *cpu_dev,
u32 data, src, lval, i, core_count, prev_freq = 0, freq;
u32 volt;
struct cpufreq_frequency_table *table;
+ struct dev_pm_opp *opp;
+ unsigned long rate;
+ int ret;

table = kcalloc(LUT_MAX_ENTRIES + 1, sizeof(*table), GFP_KERNEL);
if (!table)
return -ENOMEM;

+ ret = dev_pm_opp_of_add_table(cpu_dev);
+ if (!ret) {
+ /* Disable all opps and cross-validate against LUT */

nit: IIUC the cross-validation doesn't happen in this branch, so the
comment is a bit misleading. Maybe change it to "Disable all opps to
cross-validate against the LUT {below,later}".

sure will re-word it.


+ icc_scaling_enabled = true;
+ for (rate = 0; ; rate++) {
+ opp = dev_pm_opp_find_freq_ceil(cpu_dev, &rate);
+ if (IS_ERR(opp))
+ break;
+
+ dev_pm_opp_put(opp);
+ dev_pm_opp_disable(cpu_dev, rate);
+ }
+ } else if (ret != -ENODEV) {
+ dev_err(cpu_dev, "Invalid opp table in device tree\n");
+ return ret;
+ }
+
for (i = 0; i < LUT_MAX_ENTRIES; i++) {
data = readl_relaxed(base + REG_FREQ_LUT +
i * LUT_ROW_SIZE);
@@ -112,7 +178,7 @@ static int qcom_cpufreq_hw_read_lut(struct device *cpu_dev,

if (freq != prev_freq && core_count != LUT_TURBO_IND) {
table[i].frequency = freq;
- dev_pm_opp_add(cpu_dev, freq * 1000, volt);
+ qcom_cpufreq_update_opp(cpu_dev, freq, volt);

This is the cross-validation mentioned above, right? Shouldn't it include
a check of the return value?

Yes, this is the cross-validation step,
we adjust the voltage if opp-tables are
present/added successfully and enable
them, else we would just do a add opp.
We don't want to exit early on a single
opp failure. We will error out a bit
later if the opp-count ends up to be
zero.


dev_dbg(cpu_dev, "index=%d freq=%d, core_count %d\n", i,
freq, core_count);
} else if (core_count == LUT_TURBO_IND) {
@@ -133,7 +199,8 @@ static int qcom_cpufreq_hw_read_lut(struct device *cpu_dev,
if (prev->frequency == CPUFREQ_ENTRY_INVALID) {
prev->frequency = prev_freq;
prev->flags = CPUFREQ_BOOST_FREQ;
- dev_pm_opp_add(cpu_dev, prev_freq * 1000, volt);
+ qcom_cpufreq_update_opp(cpu_dev, prev_freq,
+ volt);

ditto

nit: with the updated max line length it isn't necessary anymore to break
this into multiple lines
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/checkpatch.pl?h=v5.8-rc1#n54),
though the coding style still has the old limit.

yeah I'll expand it.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.