[PATCH v2 5/5] powercap/drivers/dtpm: Scale the power with the load

From: Daniel Lezcano
Date: Tue Mar 09 2021 - 17:43:33 EST


Currently the power consumption is based on the current OPP power
assuming the entire performance domain is fully loaded.

That gives a very coarse power estimate, and we can do much better by
using the load to scale the power consumption.

Use the utilization to normalize and scale the power usage over the
max possible power.

Tested on a rock960 with 2 big CPUs: the power consumption estimate
matches the expected value.

Before this change:

~$ ~/dhrystone -t 1 -l 10000&
~$ cat /sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
2260000

After this change:

~$ ~/dhrystone -t 1 -l 10000&
~$ cat /sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
1130000

~$ ~/dhrystone -t 2 -l 10000&
~$ cat /sys/devices/virtual/powercap/dtpm/dtpm:0/dtpm:0:1/constraint_0_max_power_uw
2260000

Signed-off-by: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
---

V2:
- Replaced cpumask by em_span_cpus
- Changed 'util' metrics variable types
- Optimized utilization scaling power computation
- Renamed the parameter of scale_pd_power_uw()
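
For reviewers who want to check the arithmetic outside the kernel, here
is a minimal user-space sketch of the fixed-point scaling done by
scale_pd_power_uw(). It is not the driver code: scale_power_uw(), the
'util' array and SCALE_SHIFT are hypothetical names, the utilization
values stand in for what sched_cpu_util() would return, and a CPU
capacity of 1024 is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-point shift used by the patch (1 << 10 == 1024). */
#define SCALE_SHIFT 10

/*
 * User-space model of scale_pd_power_uw(). 'util' holds hypothetical
 * per-CPU utilization values, each in the range [0, max_capacity];
 * in the kernel they would come from sched_cpu_util().
 */
static uint64_t scale_power_uw(const unsigned long *util, size_t nr_cpus,
			       unsigned long max_capacity, uint64_t power_uw)
{
	uint64_t sum_util = 0;
	size_t i;

	/* Guard the division below; the kernel code has no such check. */
	assert(max_capacity != 0);

	for (i = 0; i < nr_cpus; i++)
		sum_util += util[i];

	/* ratio = sum_util / max_capacity, kept with 10 fractional bits */
	return (power_uw * ((sum_util << SCALE_SHIFT) / max_capacity))
		>> SCALE_SHIFT;
}
```

Assuming a per-CPU OPP power of 1130000 uW (inferred from the figures
above as 2260000 / 2 CPUs), one fully loaded CPU out of two gives
1130000 and two fully loaded CPUs give 2260000, which is consistent
with the dhrystone results.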
---
drivers/powercap/dtpm_cpu.c | 27 ++++++++++++++++++++-------
1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index ac7f2e7e262f..6a1537e6da0d 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -68,27 +68,40 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
return power_limit;
}

+static u64 scale_pd_power_uw(struct cpumask *pd_mask, u64 power)
+{
+ unsigned long max, sum_util = 0;
+ int cpu;
+
+ max = arch_scale_cpu_capacity(cpumask_first(pd_mask));
+
+ for_each_cpu_and(cpu, pd_mask, cpu_online_mask)
+ sum_util += sched_cpu_util(cpu, max);
+
+ return (power * ((sum_util << 10) / max)) >> 10;
+}
+
static u64 get_pd_power_uw(struct dtpm *dtpm)
{
struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *pd;
- struct cpumask cpus;
+ struct cpumask *pd_mask;
unsigned long freq;
- int i, nr_cpus;
+ int i;

pd = em_cpu_get(dtpm_cpu->cpu);
- freq = cpufreq_quick_get(dtpm_cpu->cpu);

- cpumask_and(&cpus, cpu_online_mask, to_cpumask(pd->cpus));
- nr_cpus = cpumask_weight(&cpus);
+ pd_mask = em_span_cpus(pd);
+
+ freq = cpufreq_quick_get(dtpm_cpu->cpu);

for (i = 0; i < pd->nr_perf_states; i++) {

if (pd->table[i].frequency < freq)
continue;

- return pd->table[i].power *
- MICROWATT_PER_MILLIWATT * nr_cpus;
+ return scale_pd_power_uw(pd_mask, pd->table[i].power *
+ MICROWATT_PER_MILLIWATT);
}

return 0;
--
2.17.1