Re: [PATCH v9 4/8] PM: domains: Add get_performance_state() callback

From: Dmitry Osipenko
Date: Fri Aug 27 2021 - 11:50:47 EST


27.08.2021 17:23, Ulf Hansson wrote:
> On Fri, 27 Aug 2021 at 03:37, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
>>
>> Add get_performance_state() callback that retrieves and initializes
>> performance state of a device attached to a power domain. This removes
>> inconsistency of the performance state with hardware state.
>
> Can you please try to elaborate a bit more on the use case. Users need
> to know when it makes sense to implement the callback - and so far we
> tend to document this through detailed commit messages.
>
> Moreover, please state that implementing the callback is optional.

Noted

>> Signed-off-by: Dmitry Osipenko <digetx@xxxxxxxxx>
>> ---
>> drivers/base/power/domain.c | 32 +++++++++++++++++++++++++++++---
>> include/linux/pm_domain.h | 2 ++
>> 2 files changed, 31 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index 3a13a942d012..8b828dcdf7f8 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -2700,15 +2700,41 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
>> goto err;
>> } else if (pstate > 0) {
>> ret = dev_pm_genpd_set_performance_state(dev, pstate);
>> - if (ret)
>> + if (ret) {
>> + dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
>> + pd->name, ret);
>
> Moving the dev_err() here, leads to that we won't print an error if
> of_get_required_opp_performance_state() fails, a few lines above, is
> that intentional?

Not intentional, I'll add another message.

>> goto err;
>> + }
>> dev_gpd_data(dev)->default_pstate = pstate;
>> }
>> +
>> + if (pd->get_performance_state && !dev_gpd_data(dev)->default_pstate) {
>> + bool dev_suspended = false;
>> +
>> + ret = pd->get_performance_state(pd, base_dev, &dev_suspended);
>> + if (ret < 0) {
>> + dev_err(dev, "failed to get performance state for power-domain %s: %d\n",
>> + pd->name, ret);
>> + goto err;
>> + }
>> +
>> + pstate = ret;
>> +
>> + if (dev_suspended) {
>
> The dev_suspended thing looks weird.
>
> Perhaps it was needed before dev_pm_genpd_set_performance_state()
> didn't check pm_runtime_disabled()?

There are two possible variants here:

1. Device is suspended
2. Device is active

If the device is suspended, then it will be activated on RPM-resume and
the h/w state will require a specific performance state once resumed.
Hence only the rpm_pstate should be set; otherwise the SoC may start to
consume extra power if the device is never resumed by a consumer driver
and the performance state is bumped without a real need.

If the device is known to be active, then the performance state should
be updated immediately; otherwise the state is inconsistent with the
hardware.

For Tegra dev_suspended=true because in general it should be safe to
assume that the hardware is suspended, since it's either stopped by the
PD driver on initial power_on or it's assumed to be disabled by a
consumer driver during probe. Technically it's possible to check the
clock and reset state of an attached device from get_performance_state()
to find the real state of the device, but so far it's not necessary to
do so.

I'll add a comment to the code.
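
To illustrate the two variants, here is a minimal userspace sketch. It
is not the genpd code: struct model_dev and model_init_pstate() are
made-up names that only model the rpm_pstate vs. immediate-vote decision
described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, not the kernel's per-device genpd data. */
struct model_dev {
	unsigned int rpm_pstate;    /* vote to apply on the next RPM-resume */
	unsigned int active_pstate; /* vote applied to the domain right now */
};

/*
 * Apply an initial performance state reported for a device.
 * Suspended: only remember the state, so nothing is bumped until resume.
 * Active: vote immediately, so the software state matches the hardware.
 */
static void model_init_pstate(struct model_dev *dev, unsigned int pstate,
			      bool dev_suspended)
{
	assert(dev != NULL);

	if (dev_suspended)
		dev->rpm_pstate = pstate;
	else if (pstate > 0)
		dev->active_pstate = pstate;
}
```

In the suspended case the domain sees no vote at all until the consumer
driver resumes the device, which is what avoids the extra power
consumption.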

>> + dev_gpd_data(dev)->rpm_pstate = pstate;
>> + } else if (pstate > 0) {
>> + ret = dev_pm_genpd_set_performance_state(dev, pstate);
>> + if (ret) {
>> + dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
>> + pd->name, ret);
>> + goto err;
>> + }
>> + }
>> + }
>
> Overall, what we seem to be doing here, is to retrieve a value for an
> initial/default performance state for a device and then we want to set
> it to make sure the vote becomes aggregated and finally set for the
> genpd.
>
> With your suggested change, there are now two ways to get the
> initial/default state. One is through the existing
> of_get_required_opp_performance_state() and the other is by using a
> new genpd callback.
>
> That said, perhaps we would get a bit cleaner code by moving the "get
> initial/default performance state" thingy, into a separate function
> and then call it from here. If this function returns a valid
> performance state, then we should continue to set the state, by
> calling dev_pm_genpd_set_performance_state() and update
> dev_gpd_data(dev)->default_pstate accordingly.
>
> Would that work, do you think?

To be honest, I'm now confused by
of_get_required_opp_performance_state(). It assumes that the device is
active all the time while attached and that the device is stopped on
detach.

If the hardware is always-on, then it should be wrong to drop the
performance state on detach.

If the hardware isn't always-on, then it might be suspended during
attachment, and thus only the rpm_pstate should be set. It's also not
guaranteed that the consumer driver will suspend the device on unbind;
it may be left active on detach, so it should be wrong to drop the
performance state on detach.

Hence I think the default_pstate is a bit out of touch. If this
attach/detach behaviour is specific to QCOM driver/hardware, then maybe
of_get_required_opp_performance_state() should be moved out to a
get_performance_state() of the QCOM PD driver?

I added Rajendra Nayak to explain.

For now we're bailing out if default_pstate is set because it conflicts
with get_performance_state().

But we can factor out the code into a separate function anyway to make
it a tad cleaner.
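
Roughly, the factored-out helper could take the shape below. This is a
simplified userspace sketch under my own assumptions, not the final
patch: model_get_initial_pstate(), model_get_pstate_fn and
model_example_cb() are hypothetical names, opp_pstate stands in for the
result of of_get_required_opp_performance_state(), and the error
handling of the real attach path is reduced to "negative means error".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type standing in for the proposed genpd hook. */
typedef int (*model_get_pstate_fn)(bool *dev_suspended);

/* Example callback: reports state 2 and a suspended device. */
static int model_example_cb(bool *dev_suspended)
{
	*dev_suspended = true;
	return 2;
}

/*
 * Return the initial performance state (>= 0) or a negative error.
 * The callback is consulted only when the required-opps lookup gave 0,
 * mirroring the "bail out if default_pstate is set" rule.
 */
static int model_get_initial_pstate(int opp_pstate, model_get_pstate_fn cb,
				    bool *dev_suspended)
{
	assert(dev_suspended != NULL);

	*dev_suspended = false;

	if (opp_pstate != 0) /* error or a valid state from required-opps */
		return opp_pstate;

	if (!cb) /* implementing the callback is optional */
		return 0;

	return cb(dev_suspended);
}
```

The caller would then either store the result as rpm_pstate or set it
right away, depending on dev_suspended.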

>> +
>> return 1;
>>
>> err:
>> - dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
>> - pd->name, ret);
>> genpd_remove_device(pd, dev);
>> return ret;
>> }
>> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
>> index 67017c9390c8..4f78b31791ae 100644
>> --- a/include/linux/pm_domain.h
>> +++ b/include/linux/pm_domain.h
>> @@ -133,6 +133,8 @@ struct generic_pm_domain {
>> struct dev_pm_opp *opp);
>> int (*set_performance_state)(struct generic_pm_domain *genpd,
>> unsigned int state);
>> + int (*get_performance_state)(struct generic_pm_domain *genpd,
>> + struct device *dev, bool *dev_suspended);
>
> Comparing the ->set_performance_state() callback, which sets a
> performance state for the PM domain (genpd) - this new callback is
> about retrieving the *initial/default* performance state for a
> *device* that gets attached to a genpd.
>
> That said, may I suggest renaming the callback to
> "dev_get_performance_state", or something along those lines.

Noted

>> struct gpd_dev_ops dev_ops;
>> s64 max_off_time_ns; /* Maximum allowed "suspended" time. */
>> ktime_t next_wakeup; /* Maintained by the domain governor */
>> --
>> 2.32.0
>>
>
> Kind regards
> Uffe
>