Re: [PATCH v2] thermal/core: Clear all mitigation when thermal zone is disabled

From: Manaf Meethalavalappu Pallikunhi
Date: Thu Dec 30 2021 - 13:16:26 EST



On 12/30/2021 9:39 PM, Rafael J. Wysocki wrote:
On Wed, Dec 29, 2021 at 8:03 AM Manaf Meethalavalappu Pallikunhi
<quic_manafm@xxxxxxxxxxx> wrote:
Whenever a thermal zone is in trip violated state, there is a chance
that the same thermal zone mode can be disabled either via thermal
core API or via thermal zone sysfs. Once it is disabled, the framework
bails out any re-evaluation of thermal zone. It leads to a case where
if it is already in mitigation state, it will stay the same state
until it is re-enabled.
You seem to be arguing that disabling a thermal zone should prevent it
from throttling anything, which is reasonable, but I'm not sure if the
change below is sufficient for that.

To avoid above mentioned issue, on thermal zone disable request
reset thermal zone and clear mitigation for each trip explicitly.

Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@xxxxxxxxxxx>
---
drivers/thermal/thermal_core.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 51374f4..5f4e35b 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -427,6 +427,7 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
enum thermal_device_mode mode)
{
int ret = 0;
+ int trip;
This can be declared in the block in which it is used.
Sure, will update in next patch version

mutex_lock(&tz->lock);

@@ -449,8 +450,14 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,

if (mode == THERMAL_DEVICE_ENABLED)
The coding style asks for braces here if they are used after the else.
Sure will update in next version
thermal_notify_tz_enable(tz->id);
- else
+ else {
+ /* make sure all previous throttlings are cleared */
+ thermal_zone_device_init(tz);
+ for (trip = 0; trip < tz->trips; trip++)
+ handle_thermal_trip(tz, trip);
So I'm not sure if this makes the throttling go away in all cases (eg.
what if the current temperature is still above the given trip at this
point?).

The thermal_zone_device_init() will reset current temperature with THERMAL_TEMP_INVALID. Then the following API
thandle_thermal_trip() doesn't call get_temp to sensor driver again instead it will use this reset temperature
value for each trip re-evaluation. Hence for above mentioned case, I think it will clear the throttling
irrespective of what is the actual current temperature at that instant. Otherwise please correct me

May I know is there any other possible cases where throttling will not go away completely ?

+
thermal_notify_tz_disable(tz->id);
+ }

return ret;
}