Re: [PATCH v3] cpufreq / CPPC: Add cpuinfo_cur_freq support for CPPC

From: Prakash, Prashanth
Date: Tue Jul 10 2018 - 11:49:37 EST




On 7/9/2018 11:42 PM, George Cherian wrote:
> Hi Prakash,
>
>
> On 07/09/2018 10:12 PM, Prakash, Prashanth wrote:
>>
>> Hi George,
>>
>>
>> On 7/9/2018 4:10 AM, George Cherian wrote:
>>> Per Section 8.4.7.1.3 of ACPI 6.2, the platform provides performance
>>> feedback via a set of performance counters. To determine the actual
>>> performance level delivered over time, OSPM may read a set of
>>> performance counters from the Reference Performance Counter Register
>>> and the Delivered Performance Counter Register.
>>>
>>> OSPM calculates the delivered performance over a given time period by
>>> taking a beginning and ending snapshot of both the reference and
>>> delivered performance counters, and calculating:
>>>
>>> delivered_perf = reference_perf X (delta of delivered_perf counter / delta of reference_perf counter).
>>>
>>> Implement the above and hook this to the cpufreq->get method.
>>>
>>> Signed-off-by: George Cherian <george.cherian@xxxxxxxxxx>
>>> Acked-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
>>> ---
>>>  drivers/cpufreq/cppc_cpufreq.c | 44 ++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 44 insertions(+)
>>>
>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>> index a9d3eec..61132e8 100644
>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>> @@ -296,10 +296,54 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>>  	return ret;
>>>  }
>>>
>>> +static int cppc_get_rate_from_fbctrs(struct cppc_cpudata *cpu,
>>> +				     struct cppc_perf_fb_ctrs fb_ctrs_t0,
>>> +				     struct cppc_perf_fb_ctrs fb_ctrs_t1)
>>> +{
>>> +	u64 delta_reference, delta_delivered;
>>> +	u64 reference_perf, delivered_perf;
>>> +
>>> +	reference_perf = fb_ctrs_t0.reference_perf;
>>> +
>>> +	delta_reference = (u32)fb_ctrs_t1.reference -
>>> +			  (u32)fb_ctrs_t0.reference;
>>> +	delta_delivered = (u32)fb_ctrs_t1.delivered -
>>> +			  (u32)fb_ctrs_t0.delivered;
>> Why (u32)? These registers can be 64bits and that's why cppc_perf_fb_ctrs
>> have 64b fields for reference and delivered counters.
>>
>> Moreover, the integer math is incorrect. You can run into a scenario where
>> t1.ref/del < t0.ref/del, thus assigning a negative difference to a u64! The
>> likelihood of this is very high, especially when you throw away the higher 32 bits.
>>
> Because of binary representation, unsigned subtraction will work even if
> t1.ref/del < t0.ref/del. So essentially, the code should look like
> this,
>
> static inline u64 get_delta(u64 t1, u64 t0)
> {
> 	if (t1 > t0 || t0 > ~(u32)0)
> 		return t1 - t0;
>
> 	return (u32)t1 - (u32)t0;
> }
>
> As a further optimization, I used (u32), since that also works
> as long as the momentary delta at any point is no greater than 2^32.
> I don't foresee any reason for any platform to increment the counters
> by more than 2^32 between two reads.

We are NOT running within any critical section to ensure there is no
context switch between the feedback counter reads. Thus the assumption that
the delta always represents a very short momentary window of time, and that
it is always less than 2^32, is not accurate.

The single-overflow assumption about when the above integer math will
work is also not acceptable, especially when we throw away the higher order bits.
There is hardware out there that uses 64b counters and can overflow the lower 32b
in quite a short order of time. Since the spec (and some hardware) provides 64 bits,
we should use them to make our implementation more robust instead of throwing away
the higher order bits.

I think it's ok to use the above integer math, but please add a comment about the
single overflow assumption and don't throw away the higher 32 bits.

>
>> To keep things simple, do something like below:
>>
>> if (t1.reference <= t0.reference || t1.delivered <= t0.delivered) {
>> 	/* At least one of them should have overflowed */
>> 	return desired_perf;
>> } else {
>> 	compute the delivered perf using the counters.
>> }
>
> No need to do it like this, as this is tested and found working across counter overruns on our platform.
>>
>>> +
>>> +	/* Check to avoid divide-by zero */
>>> +	if (delta_reference || delta_delivered)
>>> +		delivered_perf = (reference_perf * delta_delivered) /
>>> +				 delta_reference;
>>> +	else
>>> +		delivered_perf = cpu->perf_ctrls.desired_perf;
>>> +
>>> +	return cppc_cpufreq_perf_to_khz(cpu, delivered_perf);
>>> +}
>>> +
>>> +static unsigned int cppc_cpufreq_get_rate(unsigned int cpunum)
>>> +{
>>> +	struct cppc_perf_fb_ctrs fb_ctrs_t0 = {0}, fb_ctrs_t1 = {0};
>>> +	struct cppc_cpudata *cpu = all_cpu_data[cpunum];
>>> +	int ret;
>>> +
>>> +	ret = cppc_get_perf_ctrs(cpunum, &fb_ctrs_t0);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	udelay(2); /* 2usec delay between sampling */
>>> +
>>> +	ret = cppc_get_perf_ctrs(cpunum, &fb_ctrs_t1);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	return cppc_get_rate_from_fbctrs(cpu, fb_ctrs_t0, fb_ctrs_t1);
>>> +}
>>> +
>>>  static struct cpufreq_driver cppc_cpufreq_driver = {
>>>  	.flags = CPUFREQ_CONST_LOOPS,
>>>  	.verify = cppc_verify_policy,
>>>  	.target = cppc_cpufreq_set_target,
>>> +	.get = cppc_cpufreq_get_rate,
>>>  	.init = cppc_cpufreq_cpu_init,
>>>  	.stop_cpu = cppc_cpufreq_stop_cpu,
>>>  	.name = "cppc_cpufreq",
>>