Re: [RFC PATCH] cpufreq: cpufreq-cpu0: do not allow transitions withregulators suspended

From: Nishanth Menon
Date: Tue Nov 12 2013 - 10:11:42 EST


On 11/12/2013 12:03 AM, Viresh Kumar wrote:
> Cc'ing Shawn as well.
>
> Sorry for being really late.. I just forgot about it :(

Thanks for responding :)

>
> On 24 October 2013 23:38, Nishanth Menon <nm@xxxxxx> wrote:
>> For platforms where regulators are used, regulator access tends to be
>> disabled as part of the suspend path. In SMP systems such as OMAP,
>> CPU1 is disabled as post suspend_noirq. This results in the following
>> tail end sequence of actions:
>> cpufreq_cpu_callback gets called with CPU_POST_DEAD
>> __cpufreq_remove_dev_finish is invoked as a result
>> __cpufreq_governor(policy, CPUFREQ_GOV_START) is
>> triggered
>>
>> At this point, with ondemand governor, if the cpu entered suspend path
>> at a lower OPP, this triggers a transition request. However, since
>> irqs are disabled, typically, regulator control using I2C operations
>> are not possible either, regulator operations will naturally fail
>> (even though clk_set_rate might succeed depending on the platform).
>>
>> Unfortunately, cpufreq_driver->suspend|resume is too late as well, since
>> they are invoked as part of syscore_ops (after CPU1 is hotplugged out).
>>
>> The proposal here is to use pm notifier suspend to disable any
>> requests to target, as we may very well expect this to fail at a later
>> suspend sequence.
>
> Yes the problem looks real but there are issues with this patch.
> - It doesn't solve your problem completely, because you returned -EBUSY,
> your suspend operation failed and we resumed immediately.

Seems like there was an error handling miss somewhere - for some
reason, it did suspend properly.

> - We can't make this solution true for everybody using cpu0 driver, it should
> be platform specific.

Agreed.

> - Its not a problem of cpu0 driver but all drivers. So, probably a better idea
> would be not calling ->target() at all when drivers have marked them unusable
> in suspend path..

Ack.

>
> But I think the problem can/should be solved some other way.. Looking closely,
> we got to the problem because we called
>
> __cpufreq_governor(policy, CPUFREQ_GOV_START)
>
> at the first place. This happened because the policy structure had more than
> one cpu to take care of and after stopping goveronr for CPU1 it has to start it
> again for CPU0... But this is really not required as anyway we are going to
> suspend.
>
> Can you try attached patch? I will then repost it formally...

I tried a equivalent of this for v3.12 tag:
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 04548f7..9ec243c 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1186,7 +1186,7 @@ static int __cpufreq_remove_dev_prepare(struct
device *dev,
return -EINVAL;
}

- if (cpufreq_driver->target) {
+ if (cpufreq_driver->target && (!frozen ||
policy->governor_enabled)) {
ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
if (ret) {
pr_err("%s: Failed to stop governor\n", __func__);
@@ -1252,7 +1252,7 @@ static int __cpufreq_remove_dev_finish(struct
device *dev,

/* If cpu is last user of policy, free policy */
if (cpus == 1) {
- if (cpufreq_driver->target) {
+ if (cpufreq_driver->target && !frozen) {
ret = __cpufreq_governor(policy,
CPUFREQ_GOV_POLICY_EXIT);
if (ret) {

And I see http://pastebin.mozilla.org/3528478

with a WARN patch for generating call stack.


Finally squelched warnings with a net diff (v3.12) of
http://pastebin.mozilla.org/3546062

However, ondemand is no longer functioning on resume (governor needs a
start after being unfrozen.. and obviously by avoiding that entirely
in frozen case.. not sure if I missed any other)..

>
> commit 2aecab9be85ceafdbab5f824eec5d1f81f3fa803
> Author: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
> Date: Tue Nov 12 11:26:36 2013 +0530
>
> cpufreq: don't start governor in suspend path
>
> When we suspend our system, we remove all non-boot CPUs one by one. At this
> point we actually STOP/START governor for each non-boot cpu, which
> is a total
> waste of time as we aren't going to use governor until the time we are back.
>
> Also, this is causing problems for some platforms (like OMAP),
> where governor
> tries to change freq of core in suspend path which requires programming
> regulators via I2C which isn't possible then.
>
> So, to make it better for everybody don't start the governor again
> in suspend
> path.
>
> Reported-by: Nishanth Menon <nm@xxxxxx>
> Signed-off-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
> ---
> drivers/cpufreq/cpufreq.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 02d534d..bec58cf 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1174,7 +1174,7 @@ static int __cpufreq_remove_dev_prepare(struct
> device *dev,
> return -EINVAL;
> }
>
> - if (has_target()) {
> + if (has_target() && (!frozen || policy->governor_enabled)) {
> ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> if (ret) {
> pr_err("%s: Failed to stop governor\n", __func__);
> @@ -1282,7 +1282,7 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
> if (!frozen)
> cpufreq_policy_free(policy);
> } else {
> - if (has_target()) {
> + if (has_target() && !frozen) {
> if ((ret = __cpufreq_governor(policy,
> CPUFREQ_GOV_START)) ||
> (ret =
> __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS))) {
> pr_err("%s: Failed to start governor\n",
>
>




--
Regards,
Nishanth Menon
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/