Re: [PATCH 14/18] drivers: firmware: psci: Manage runtime PM in the idle path for CPUs

From: Ulf Hansson
Date: Thu Jul 18 2019 - 17:49:50 EST


On Thu, 18 Jul 2019 at 19:41, Lina Iyer <ilina@xxxxxxxxxxxxxx> wrote:
>
> On Thu, Jul 18 2019 at 10:55 -0600, Ulf Hansson wrote:
> >On Thu, 18 Jul 2019 at 15:31, Lorenzo Pieralisi
> ><lorenzo.pieralisi@xxxxxxx> wrote:
> >>
> >> On Thu, Jul 18, 2019 at 12:35:07PM +0200, Ulf Hansson wrote:
> >> > On Tue, 16 Jul 2019 at 17:53, Lorenzo Pieralisi
> >> > <lorenzo.pieralisi@xxxxxxx> wrote:
> >> > >
> >> > > On Mon, May 13, 2019 at 09:22:56PM +0200, Ulf Hansson wrote:
> >> > > > When the hierarchical CPU topology layout is used in DT, let's allow the
> >> > > > CPU to be power managed through its PM domain, via deploying runtime PM
> >> > > > support.
> >> > > >
> >> > > > To know for which idle states runtime PM reference counting is needed,
> >> > > > let's store the index of deepest idle state for the CPU, in a per CPU
> >> > > > variable. This allows psci_cpu_suspend_enter() to compare this index with
> >> > > > the requested idle state index and then act accordingly.
> >> > >
> >> > > I do not see why a system with two CPU CPUidle states, say CPU retention
> >> > > and CPU shutdown, should not be calling runtime PM on CPU retention
> >> > > entry.
> >> >
> >> > If the CPU idle governor did select the CPU retention for the CPU, it
> >> > was probably because the target residency for the CPU shutdown state
> >> > could not be met.
> >>
> >> The kernel does not know what those cpu states represent, so, this is an
> >> assumption you are making and it must be made clear that this code works
> >> as long as your assumption is valid.
> >>
> >> If eg a "cluster" retention state has lower target_residency than
> >> the deepest CPU idle state this assumption is wrong.
> >
> >Good point, you are right. I try to find a place to document this assumption.
> >
> >>
> >> And CPUidle and genPD governor decisions are not synced anyway so,
> >> again, this is an assumption, not a certainty.
> >>
> >> > In this case, there is no point in allowing any other deeper idle
> >> > states for cluster/package/system, since those have even greater
> >> > residencies, hence calling runtime PM doesn't make sense.
> >>
> >> On the systems you are testing on.
> >
> >So what you are saying typically means, that if all CPUs in the same
> >cluster have entered the CPU retention state, on some system the
> >cluster may also put into a cluster retention state (assuming the
> >target residency is met)?
> >
> >Do you know of any systems that has these characteristics?
> >
> Many QCOM SoCs can do that. But with the hardware improving, the
> power-performance benefits skew the results in favor of powering off
> the cluster than keeping the CPU and cluster in retention.
>
> Kevin H and I thought of this problem earlier on. But that is a second
> level problem to solve and definitely to be thought of after we have the
> support for the deepest states in the kernel. We left that out for a
> later date. The idea would have been to setup the allowable state(s) in
> the DT for CPU and cluster state definitions and have the genpd take
> that into consideration when deciding the idle state for the domain.

Thanks for confirming.

This more or less means we need to improve the hierarchical support in
genpd to support more levels, such that it makes sense to have a genpd
governor assigned at more than one level. This doesn't work well
today. As I also have stated, this is on my todo list for genpd.

However, I also agree with your standpoint, that let's start simple to
enable the deepest state as a start with, then we can improve things
on top.

Kind regards
Uffe