Re: [PATCH] arm64: dts: qcom: qrb5165-rb5: Disable cpuidle states

From: Amit Pundir
Date: Tue Oct 25 2022 - 07:24:35 EST


On Fri, 21 Oct 2022 at 18:33, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>
> On Thu, 20 Oct 2022 at 18:16, Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
> >
> > On Thu, Oct 20, 2022 at 04:40:15PM +0200, Ulf Hansson wrote:
> > > On Thu, 20 Oct 2022 at 16:09, Amit Pundir <amit.pundir@xxxxxxxxxx> wrote:
> > > >
> > > > On Thu, 20 Oct 2022 at 15:01, Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
> > > > >
> > > > > On Wed, Oct 19, 2022 at 01:57:34PM +0200, Ulf Hansson wrote:
> > > > > > On Tue, 18 Oct 2022 at 16:53, Amit Pundir <amit.pundir@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Disable cpuidle states for RB5. These cpuidle states
> > > > > > > made the device highly unstable and it runs into the
> > > > > > > following crash frequently:
> > > > > > >
> > > > > > > [ T1] vreg_l11c_3p3: failed to enable: -ETIMEDOUT
> > > > > > > [ T1] qcom-rpmh-regulator 18200000.rsc:pm8150l-rpmh-regulators: ldo11: devm_regulator_register() failed, ret=-110
> > > > > > > [ T1] qcom-rpmh-regulator: probe of 18200000.rsc:pm8150l-rpmh-regulators failed with error -110
> > > > > > >
> > > > > > > Fixes: 32bc936d7321 ("arm64: dts: qcom: sm8250: Add cpuidle states")
> > > > > > > Signed-off-by: Amit Pundir <amit.pundir@xxxxxxxxxx>
> > > > > > > ---
> > > > > > > arch/arm64/boot/dts/qcom/qrb5165-rb5.dts | 8 ++++++++
> > > > > > > 1 file changed, 8 insertions(+)
> > > > > > >
> > > > > > > diff --git a/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts b/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
> > > > > > > index cc003535a3c5..f936c41bfbea 100644
> > > > > > > --- a/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
> > > > > > > +++ b/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
> > > > > > > @@ -251,6 +251,14 @@ qca639x: qca639x {
> > > > > > >
> > > > > > > };
> > > > > > >
> > > > > > > +&LITTLE_CPU_SLEEP_0 {
> > > > > > > + status = "disabled";
> > > > > > > +};
> > > > > > > +
> > > > > > > +&BIG_CPU_SLEEP_0 {
> > > > > > > + status = "disabled";
> > > > > > > +};
> > > > > > > +
> > > > > > > &adsp {
> > > > > > > status = "okay";
> > > > > > > firmware-name = "qcom/sm8250/adsp.mbn";
> > > > > > > --
> > > > > > > 2.25.1
> > > > > >
> > > > > > > Disabling the CPU idle states will revert us back to using only the WFI state.
> > > > > >
> > > > > > An option that probably works too is to just drop the idlestate for
> > > > > > the CPU cluster. Would you mind trying the below and see if that works
> > > > > > too?
> > > > > >
> > > > >
> > > > > > Indeed, this is what I suggested to check initially. But I was surprised to
> > > > > > see that, IIUC, Amit just disabled the CPU states with the above change and
> > > > > > got it working. So it is not the cluster state alone causing the issue; is it
> > > > > > somehow the presence of both CPU and cluster states? Am I missing something here?
> > > > >
> > > > > > diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > > > > > b/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > > > > > index c32227ea40f9..c707a49e8001 100644
> > > > > > --- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > > > > > +++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > > > > > @@ -700,7 +700,6 @@ CPU_PD7: cpu7 {
> > > > > >
> > > > > > CLUSTER_PD: cpu-cluster0 {
> > > > > > #power-domain-cells = <0>;
> > > > > > - domain-idle-states = <&CLUSTER_SLEEP_0>;
> > > > >
> > > > > How about just marking CLUSTER_SLEEP_0 state disabled ? That looks cleaner
> > > > > than deleting this domain-idle-states property here. Also not sure if DTS
> > > > > warnings will appear if you delete this ?
> > > >
> > > > Hi, I did try disabling CLUSTER_SLEEP_0: cluster-sleep-0 {} in
> > > > > domain-idle-states {} but that didn't help. That's why I ended up
> > > > > disabling the individual CPU states in idle-states {}.
> > >
> > > Yep, this boils down to the fact that genpd doesn't check whether the
> > > domain-idle-state is disabled by using of_device_is_available(). See
> > > genpd_iterate_idle_states().
> > >
> >
> > Yes I found that but can't that be fixed with a simple patch like below ?
>
> Sure, yes it can.
>
> Although, it does complicate things a bit, as we would need two
> patches instead of one, to get things working.
>
> >
> > > That said, I suggest we go with the above one-line change. It may not
> > > be as clean as it could be, but certainly easy to revert when the
> > > support for it has been added in a newer kernel.
> > >
> >
> > > I don't like removing the state. It means it doesn't have the state, rather
> > > than "it has the state but it is not working and hence disabled".
> >
> > Will handling the availability of the state cause any issues ?
>
> No, this works fine. It's already been proven by Amit's test.
>
> >
> > Regards,
> > Sudeep
> >
> > -->8
> >
> > diff --git i/drivers/base/power/domain.c w/drivers/base/power/domain.c
> > index ead135c7044c..6471b559230e 100644
> > --- i/drivers/base/power/domain.c
> > +++ w/drivers/base/power/domain.c
> > @@ -2952,6 +2952,10 @@ static int genpd_iterate_idle_states(struct device_node *dn,
> > np = it.node;
> > if (!of_match_node(idle_state_match, np))
> > continue;
> > +
> > + if (!of_device_is_available(np))
> > + continue;
> > +
> > if (states) {
> > ret = genpd_parse_state(&states[i], np);
> > if (ret) {
> >
>
> The above code looks correct to me. Anyone that wants to submit the
> patches? Otherwise I can try to manage it...

Just out of curiosity, I gave this patch a test run and, as Ulf also
mentioned above, this patch alone is not enough to fix the boot
regression I see on RB5.
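
For reference, the DTS half of the two-patch approach discussed above
would presumably be a board-level override along the lines of the sketch
below. It assumes the CLUSTER_SLEEP_0 label from sm8250.dtsi and
placement in qrb5165-rb5.dts, and it would only take effect once genpd
skips unavailable states as in the domain.c change quoted above:

```dts
/*
 * Sketch only: the placement in qrb5165-rb5.dts is an assumption.
 * CLUSTER_SLEEP_0 is the cluster-sleep-0 node under
 * domain-idle-states {} in sm8250.dtsi.
 */
&CLUSTER_SLEEP_0 {
	status = "disabled";
};
```

This keeps the state described in the SoC dtsi and marks it as not
working on this board, rather than deleting the domain-idle-states
property from CLUSTER_PD.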

Regards,
Amit Pundir

>
> Kind regards
> Uffe