Re: [PATCH v3] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate

From: Stephen Boyd
Date: Wed Apr 06 2022 - 12:07:03 EST


Quoting Stephen Boyd (2022-04-05 16:00:55)
> Quoting Georgi Djakov (2021-11-25 09:47:51)
> > From: Mike Tipton <mdtipton@xxxxxxxxxxxxxx>
> >
> > We're only adding BCMs to the commit list in aggregate(), but there are
> > cases where pre_aggregate() is called without subsequently calling
> > aggregate(). In particular, in icc_sync_state() when a node with initial
> > BW has zero requests. Since BCMs aren't added to the commit list in
> > these cases, we don't actually send the zero BW request to HW. So the
> > resources remain on unnecessarily.
> >
> > Add BCMs to the commit list in pre_aggregate() instead, which is always
> > called even when there are no requests.
> >
> > Signed-off-by: Mike Tipton <mdtipton@xxxxxxxxxxxxxx>
> > [georgi: remove icc_sync_state for platforms with incomplete support]
> > Signed-off-by: Georgi Djakov <djakov@xxxxxxxxxx>
>
> This patch fixes suspend/resume for me on sc7180-trogdor-lazor. Without
> it I can't achieve XO shutdown. It seems that it fixes the sync_state
> support that was added in commit b1d681d8d324 ("interconnect: Add sync
> state support"). Before that commit suspend worked because the
> interconnect wasn't maxed out at boot. After that commit we started
> maxing out the interconnect state and never dropping it.

I'm also wondering if this means suspend doesn't work without sync_state
support? Does this mean that device links are required? And device links
are only made if fw_devlink is enabled? I don't see any devlinks made in
drivers/interconnect so I worry that we have to use fw_devlink to get
device links made to make sync_state happen to remove the max votes that
are put in at boot.

>
> It would be good to pick this back to stable kernels so we have a
> working suspend/resume on LTS kernels. I tried picking it back to
> 5.10.109 (latest 5.10 LTS) and booting it on my Lazor w/ LTE device but
> it crashes at boot pretty reliably in the IPA driver. Interestingly I
> can't get it to crash on 5.15.32 when I pick it back, so maybe something
> has changed between 5.10 and 5.15 for IPA? I'll try to bisect it.

Bisecting pointed to commit 1aac309d3207 ("net: ipa: use autosuspend")
as fixing it. I think before that commit we weren't enabling some
interconnect, but now we're booting, runtime suspending, and then
runtime resuming again. With the sync state patch I suspect the
interconnect bandwidth is dropped and IPA needs to use runtime PM to
actually turn resources back on because it assumed that resources are on
when it probes.