Re: [PATCH v5 0/3] Thermal ACPI APIs for generic trip points

From: Rafael J. Wysocki
Date: Thu Jan 19 2023 - 12:05:38 EST


On Thu, Jan 19, 2023 at 5:58 PM srinivas pandruvada
<srinivas.pandruvada@xxxxxxxxxxxxxxx> wrote:
>
> On Thu, 2023-01-19 at 13:17 +0100, Rafael J. Wysocki wrote:
> > On Thu, Jan 19, 2023 at 12:04 AM srinivas pandruvada
> > <srinivas.pandruvada@xxxxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, 2023-01-18 at 23:14 +0100, Daniel Lezcano wrote:
> > > > On 18/01/2023 22:16, srinivas pandruvada wrote:
> > > > > On Wed, 2023-01-18 at 22:01 +0100, Daniel Lezcano wrote:
> > > > > > On 18/01/2023 21:53, srinivas pandruvada wrote:
> > > > > > > On Wed, 2023-01-18 at 21:00 +0100, Daniel Lezcano wrote:
> > > > > > > > On 18/01/2023 20:16, srinivas pandruvada wrote:
> > > > > > > >
> > > > > > > > [ ... ]
> > > > > > > >
> > > > > > > > > > > But we'd better wait for the thermald test result
> > > > > > > > > > > from
> > > > > > > > > > > Srinvias.
> > > > > > > > > >
> > > > > > > > > > A quick test show that things still work with
> > > > > > > > > > thermald
> > > > > > > > > > and
> > > > > > > > > > these
> > > > > > > > > > changes.
> > > > > > > > >
> > > > > > > > > But I have a question. In some devices trip point
> > > > > > > > > temperature
> > > > > > > > > is
> > > > > > > > > not
> > > > > > > > > static. When hardware changes, we get notification. For
> > > > > > > > > example
> > > > > > > > > INT3403_PERF_TRIP_POINT_CHANGED for INT3403 drivers.
> > > > > > > > > Currently get_trip can get the latest changed value.
> > > > > > > > > But if
> > > > > > > > > we
> > > > > > > > > preregister, we need some mechanism to update them.
> > > > > > > >
> > > > > > > > When the notification INT3403_PERF_TRIP_POINT_CHANGED
> > > > > > > > happens, we
> > > > > > > > call
> > > > > > > > int340x_thermal_read_trips() which in turn updates the
> > > > > > > > trip
> > > > > > > > points.
> > > > > > > >
> > > > > > >
> > > > > > > Not sure how we handle concurrency here when driver can
> > > > > > > freely
> > > > > > > update
> > > > > > > trips while thermal core is using trips.
> > > > > >
> > > > > > Don't we have the same race without this patch ? The thermal
> > > > > > core
> > > > > > can
> > > > > > call get_trip_temp() while there is an update, no ?
> > > > > Yes it is. But I can add a mutex locally here to solve.
> > > > > But not any longer.
> > > > >
> > > > > I think you need some thermal_zone_read_lock/unlock() in core,
> > > > > which
> > > > > can use rcu. Even mutex is fine as there will be no contention
> > > > > as
> > > > > updates to trips will be rare.
> > > >
> > > > I was planning to provide a thermal_trips_update(tz, trips) and
> > > > from
> > > > there handle the locking.
> > > >
> > > > As the race was already existing, can we postpone this change
> > > > after
> > > > the
> > > > generic trip points changes?
> > > I think so.
> >
> > Well, what if this bug is reported by a user and a fix needs to be
> > backported to "stable"?
> >
> > Are we going to backport the whole framework redesign along with it?
> >
> > Or is this extremely unlikely?
> These trips are read at the start of DTT/Thermald and will be read once
> after notification is received. So extremely unlikely.
> But we can add the patch before this series to address this issue,
> which can be marked stable. I can submit this.

Looks reasonable to me.