Re: [PATCH V4 4/4] thermal/drivers/cpu_cooling: Rename to cpufreq_cooling

From: Daniel Lezcano
Date: Tue Dec 10 2019 - 04:15:45 EST


On 10/12/2019 09:57, Martin Kepplinger wrote:
>
>
> On 09.12.19 20:29, Daniel Lezcano wrote:
>> On 09/12/2019 13:03, Daniel Lezcano wrote:
>>> On 09/12/2019 10:54, Martin Kepplinger wrote:
>>>>
>>>>
>>>> On 06.12.19 15:15, Daniel Lezcano wrote:
>>>>> On 06/12/2019 12:33, Martin Kepplinger wrote:
>>>>>> I tested this on the librem5-devkit and see the
>>>>>> cooling devices in sysfs. I configure ARM_PSCI_CPUIDLE, not ARM_CPUIDLE and
>>>>>> add the patch below in register the cooling device there. "psci_idle"
>>>>>> is listed as the cpuidle_driver.
>>>>>>
>>>>>> That's what I'm running, in case you want to see it all:
>>>>>> https://source.puri.sm/martin.kepplinger/linux-next/commits/next-20191205/librem5_cpuidle_mainline_atf
>>>>>>
>>>>>> so I add a trip temperature description like this:
>>>>>> https://source.puri.sm/martin.kepplinger/linux-next/commit/361f49f93ae2c477fd012790831cabd0ed976660
>>>>>>
>>>>>> When I let the SoC heat up, cpuidle cooling won't kick it. In sysfs:
>>>>>>
>>>>>> catting the relevant files in /sys/class/thermal after heating up,
>>>>>> if that makes sense:
>>>>>>
>>>>>> 87000
>>>>>> 85000
>>>>>> 85000
>>>>>> thermal-cpufreq-0
>>>>>> 1
>>>>>> thermal-idle-0
>>>>>> 0
>>>>>> thermal-idle-1
>>>>>> 0
>>>>>> thermal-idle-2
>>>>>> 0
>>>>>> thermal-idle-3
>>>>>> 0
>>>>>>
>>>>>> with ARM_CPUIDLE instead of ARM_PSCI_CPUIDLE (and registering the cooling dev
>>>>>> during cpuidle-arm.c init) I won't have a cpuidle driver and thus no cpu-sleep
>>>>>> state at all.
>>>>>>
>>>>>> Can you see where the problem here lies?
>>>>>
>>>>> Yes, I removed the registration via the DT.
>>>>>
>>>>> Can you try the following:
>>>>>
>>>>> diff --git a/drivers/cpuidle/dt_idle_states.c
>>>>> b/drivers/cpuidle/dt_idle_states.c
>>>>> index d06d21a9525d..01367ddec49a 100644
>>>>> --- a/drivers/cpuidle/dt_idle_states.c
>>>>> +++ b/drivers/cpuidle/dt_idle_states.c
>>>>> @@ -13,6 +13,7 @@
>>>>> #include <linux/errno.h>
>>>>> #include <linux/kernel.h>
>>>>> #include <linux/module.h>
>>>>> +#include <linux/cpu_cooling.h>
>>>>> #include <linux/of.h>
>>>>> #include <linux/of_device.h>
>>>>>
>>>>> @@ -205,6 +206,9 @@ int dt_init_idle_driver(struct cpuidle_driver *drv,
>>>>> err = -EINVAL;
>>>>> break;
>>>>> }
>>>>> +
>>>>> + cpuidle_of_cooling_register(state_node, drv);
>>>>> +
>>>>> of_node_put(state_node);
>>>>> }
>>>>>
>>>>> That's a hack for the moment.
>>>>>
>>>>
>>>> thanks. I could test that successfully. The only question would be: Is
>>>> is intentional how "non-aggressive" the cooling driver cools? I would
>>>> have expected it to basically inject more idle cycles earlier. I'd set
>>>> 75 degrees as trip point and at 85 degress is would only inject about 30
>>>> (of 100).
>>
>> By the way, how many CPUs are injecting idle cycle when the mitigation
>> happens ?
>
> all 4 are injecting the same.
>
>>
>>>> You describe the "config values" in question in the documentation, but
>>>> I'm not sure what's the correct way to change them.
>>>
>>> That is difficult to say without knowing the board behavior. Are you
>>> able to profile the temperature with the load? How fast the temperature
>>> increases? The aggressive behavior of the cooling device will depend on
>>> the governor which depends on the slope of the temperature increase and
>>> the sampling.
>>>
>>> Can you give the pointer to the git tree with the DT definition of your
>>> board?
>
> https://source.puri.sm/martin.kepplinger/linux-next/blob/next-20191205/librem5_cpuidle_mainline_atf/arch/arm64/boot/dts/freescale/imx8mq-librem5-devkit.dts
>
> you can browse in that branch.
>
>>>
>>> You can try by changing the idle duration to 10ms instead of the default
>>> 4ms.
>
> where is that set?

diff --git a/drivers/thermal/cpuidle_cooling.c
b/drivers/thermal/cpuidle_cooling.c
index 369c5c613f6b..0793e722b2d2 100644
--- a/drivers/thermal/cpuidle_cooling.c
+++ b/drivers/thermal/cpuidle_cooling.c
@@ -192,7 +192,7 @@ __init cpuidle_of_cooling_register(struct
device_node *np,
goto out_id;
}

- idle_inject_set_duration(ii_dev, 0, TICK_USEC);
+ idle_inject_set_duration(ii_dev, 0, 10000);

idle_cdev->ii_dev = ii_dev;


>>>
>>> You can also change the cooling states in the DT <&state 20 70>, so it
>>> will begin to mitigate at state 20. But I wouldn't recommend that.
>
> where would we assign that? I'm not sure who reads that -.-> it's still something to consider, but a longer idle duration makes more
> sense, yes.
>
>>>
>>> Do you have the energy power model, so we can try with the IPA governor?
>>>
>>>
>
> thanks for the reminder. I'd look at that later.
>
> martin
>


--
<http://www.linaro.org/> Linaro.org â Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog