Re: [PATCH v3] thermal/core: Clear all mitigation when thermal zone is disabled

From: Thara Gopinath
Date: Mon Jan 10 2022 - 12:55:34 EST


Hi Manaf,

On 1/7/22 1:56 PM, Manaf Meethalavalappu Pallikunhi wrote:
Whenever a thermal zone is in a trip-violated state, there is a chance
that the same thermal zone can be disabled, either via the thermal
core API or via thermal zone sysfs. Once it is disabled, the framework
bails out of any re-evaluation of the thermal zone. As a result, if the
zone is already in a mitigation state, it stays in that state until it
is re-enabled.

To avoid the above-mentioned issue, on a thermal zone disable request,
reset the thermal zone and clear mitigation for each trip explicitly.

Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@xxxxxxxxxxx>
---
drivers/thermal/thermal_core.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 51374f4..e288c82 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -447,10 +447,18 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
 	thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
-	if (mode == THERMAL_DEVICE_ENABLED)
+	if (mode == THERMAL_DEVICE_ENABLED) {
 		thermal_notify_tz_enable(tz->id);
-	else
+	} else {
+		int trip;
+
+		/* make sure all previous throttlings are cleared */
+		thermal_zone_device_init(tz);

It looks weird to do an init when you are actually disabling the thermal zone.


+		for (trip = 0; trip < tz->trips; trip++)
+			handle_thermal_trip(tz, trip);

So this is exactly what thermal_zone_device_update() does, except that thermal_zone_device_update() checks the mode and bails out if the zone is disabled.
This works because, as you explained in v2, thermal_zone_device_init() resets the temperature, so handle_thermal_trip() then removes any mitigation that is in place.

My two cents here (Rafael and Daniel can comment more on this).

I think it will be cleaner if we have a third mode, THERMAL_DEVICE_DISABLING, and let thermal_zone_device_update() handle clearing the mitigation. It would look like this:

	if (mode == THERMAL_DEVICE_DISABLED)
		tz->mode = THERMAL_DEVICE_DISABLING;
	else
		tz->mode = mode;

	thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);

	if (mode == THERMAL_DEVICE_DISABLED)
		tz->mode = mode;

You will also have to update update_temperature() to set tz->temperature = THERMAL_TEMP_INVALID, and thermal_zone_set_trips() to set tz->prev_low_trip = -INT_MAX and tz->prev_high_trip = INT_MAX, when the zone is in THERMAL_DEVICE_DISABLING mode.
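
To make the flow concrete, here is a rough user-space sketch of the idea in C. It is a toy model and not kernel code: struct zone, zone_update() and zone_set_mode() are hypothetical stand-ins for the thermal zone device, thermal_zone_device_update() and thermal_zone_device_set_mode(). It assumes a single trip, uses a stand-in value for the invalid temperature, and leaves out locking, notifications and the prev_low_trip/prev_high_trip reset.

```c
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not the kernel's. */
enum zone_mode {
	ZONE_DISABLED,
	ZONE_ENABLED,
	ZONE_DISABLING,		/* the proposed transient third mode */
};

/* Stand-in for the "no valid reading" marker (assumed value). */
#define MODEL_TEMP_INVALID (-274000)

struct zone {
	enum zone_mode mode;
	int sensor_temp;	/* what the sensor would report */
	int temperature;	/* last temperature seen by the core */
	int trip_temp;		/* single trip point in this model */
	bool throttled;		/* true while mitigation is applied */
};

/* Models the update path: bail out when disabled, else re-evaluate. */
static void zone_update(struct zone *z)
{
	if (z->mode == ZONE_DISABLED)
		return;		/* disabled zones are never re-evaluated */

	/* While disabling, pretend there is no valid reading. */
	if (z->mode == ZONE_DISABLING)
		z->temperature = MODEL_TEMP_INVALID;
	else
		z->temperature = z->sensor_temp;

	/* Trip handling: an invalid temperature always clears mitigation. */
	z->throttled = z->temperature != MODEL_TEMP_INVALID &&
		       z->temperature >= z->trip_temp;
}

/* Models the mode change: pass through DISABLING before DISABLED. */
static void zone_set_mode(struct zone *z, enum zone_mode mode)
{
	z->mode = (mode == ZONE_DISABLED) ? ZONE_DISABLING : mode;

	zone_update(z);		/* one last pass that clears mitigation */

	if (mode == ZONE_DISABLED)
		z->mode = mode;
}
```

In this model, a zone that is throttled when zone_set_mode(&z, ZONE_DISABLED) is called ends up with throttled == false and mode == ZONE_DISABLED, and later zone_update() calls leave it alone until it is enabled again. The update path stays the only place that touches mitigation, which is the point of the suggestion.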

--
Warm Regards
Thara (She/Her/Hers)
+
 		thermal_notify_tz_disable(tz->id);
+	}
 	return ret;
 }