Re: [PATCH] cpufreq: suspend/resume governors with PM notifiers

From: Rafael J. Wysocki
Date: Sun Nov 17 2013 - 16:24:38 EST


On Sunday, November 17, 2013 10:27:43 PM Viresh Kumar wrote:
> On 17 November 2013 20:39, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> > On Sunday, November 17, 2013 01:52:15 PM viresh kumar wrote:
>
> >> Do you see anything extra that might stop working?
> >
> > Well, the code would be racy with the patch as is. User space might manipulate
> > the sysfs knobs in parallel with your PM notifiers, for example, and I'm not
> > entirely sure what can happen then. And the lock in there is pointless,
> > because it doesn't prevent any races from happening.
>
> Hmm..
>
> >> A unrelated question here. Why are we offlining CPUs after suspending all the
> >> devices? Because the problem Nishanth mentioned was that he required few
> >> devices, i2c, to be available when CPUs are getting down. And there might be
> >> similar requirements at other places too. Was there any specific bottleneck due
> >> to which it is implemented this way?
> >
> > No, this is because the ACPI spec mandates powering down devices before CPUs
> > during system suspend. The way it is done today, however, I think we don't
> > need to keep that ordering so strictly any more. We definitely don't need to
> > do that on non-ACPI systems.
>
> Okay.. But different behavior for acpi and non-acpi at such core code doesn't
> look great to me.. Lets see if we can work with existing code
>
> > So while I hate the PM notifiers idea (sorry, but that's how it goes), I think
> > it would be OK to suspend *some* devices after disabling CPUs (not all of them,
> > of course).
>
> Hmm...
>
> > And as I said, I think it would be OK to introduce suspend/resume callbacks for
> > CPU devices and use those callbacks to work around the ordering issues, when
> > necessary. The main point is that the changes made for this purpose should
> > only affect systems where they are necessary and not everyone. I don't want
> > to change the way things work today in general in cpufreq too much unless they
> > are plain bugs that affect everyone.
> >
> >> > We may introduce suspend_noirq and resume_noirq for cpu_subsys, for example,
> >> > and handle things from there. Or something similar. But slapping PM notifiers
> >> > on top of the existing code just because it appears to be easy (and making that
> >> > code even more overdesigned than it already is this way) doesn't seem quite
> >> > right.
>
> Okay.. Even these notifiers would be fine for me. To make things more clear
> before I start implementing them:
> - What about implementing syscore's suspend_prepare and post_suspend?

I'm not sure how useful that would be. When would you like to execute those
things?

> - Or you want to extend only CPU subsystems notifiers? What notifiers exactly?
> And at which places we want to issue them from?

Why do we need to use notifiers? What about PM callbacks?

> >> - I have concerns with Tianyu's patch as policies should be better taken care of
> >> in cpufreq core instead of passing them over to governors.
> >
> > Well, this is all too tangled anyway, but quite frankly I'm not sure if it is
> > worth untangling at this point. We're deprecating cpufreq anyway.
>
> I don't really think cpufreq is going anywhere even after we move bits
> into scheduler. All the backend CPUFreq stuff will probably stay as is,
> as they can't go anywhere.
>
> One thing that will surely go away is the governor's layer, which would
> be directly embedded into scheduler.
>
> Lets see how that goes..
>
> >> - Not all platforms have problem with changing frequency during suspend/resume
> >> and so we may not require disabling of governors for all of them. Probably can
> >> add another field based on which we may/may-not disable governors from PM or
> >> syscore notifiers.
> >
> > What exactly is wrong with adding suspend/resume callbacks to cpu_subsys?
>
> Okay, so you were asking about extending following list: CPU_ONLINE,
> CPU_UP_PREPARE, CPU_UP_CANCELED, CPU_DOWN_PREPARE, etc.. to
> include suspend/resume ones as well?

No. Bus types (among other things) may provide suspend/resume callbacks for
handling devices. We have a bus type for CPUs, which is called cpu_subsys
and currently doesn't define any PM callbacks, although it could do that in
principle. Have you investigated that possibility?

> Logically speaking, all existing ones does look correct as they are more or
> less cpu related. But suspend/resume doesn't look any similar, Atleast to me.
>
> Suspend/resume are system's state rather than CPU's.. We aren't suspending
> or resuming CPUs, we are shutting them off.. So I thought maybe syscore ops
> is a better place (which is already used by cpufreq)..

cpufreq uses syscore_ops for the boot CPU only and that admittedly is a hack.
syscore_ops is specifically for things that have to be suspended with only one
CPU online and with interrupts off. I'm not sure how that applies to cpufreq.

Thanks!

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/