Re: [PATCH v6 1/2] power: domain: handle genpd correctly when needing interrupts
From: Martin Kepplinger
Date: Mon Sep 26 2022 - 05:53:20 EST
On Friday, 23.09.2022 at 15:55 +0200, Ulf Hansson wrote:
> On Thu, 25 Aug 2022 at 09:06, Martin Kepplinger
> <martin.kepplinger@xxxxxxx> wrote:
> >
> > On Wednesday, 24.08.2022 at 15:30 +0200, Ulf Hansson wrote:
> > > On Mon, 22 Aug 2022 at 10:38, Martin Kepplinger
> > > <martin.kepplinger@xxxxxxx> wrote:
> > > >
> > > > On Friday, 19.08.2022 at 16:53 +0200, Ulf Hansson wrote:
> > > > > On Fri, 19 Aug 2022 at 11:17, Martin Kepplinger
> > > > > <martin.kepplinger@xxxxxxx> wrote:
> > > > > >
> > > > > > On Tuesday, 26.07.2022 at 17:07 +0200, Ulf Hansson wrote:
> > > > > > > On Tue, 26 Jul 2022 at 10:33, Martin Kepplinger
> > > > > > > <martin.kepplinger@xxxxxxx> wrote:
> > > > > > > >
> > > > > > > > If for example the power-domains' power-supply node (regulator) needs
> > > > > > > > interrupts to work, the current setup with noirq callbacks cannot
> > > > > > > > work; for example a pmic regulator on i2c, when suspending, usually
> > > > > > > > already times out during suspend_noirq:
> > > > > > > >
> > > > > > > > [ 41.024193] buck4: failed to disable: -ETIMEDOUT
> > > > > > > >
> > > > > > > > So fix system suspend and resume for these power-domains by using the
> > > > > > > > "outer" suspend/resume callbacks instead. Tested on the imx8mq-librem5
> > > > > > > > board, but by looking at the dts, this will fix imx8mq-evk and possibly
> > > > > > > > many other boards too.
> > > > > > > >
> > > > > > > > This is designed so that genpd providers just say "this genpd needs
> > > > > > > > interrupts" (by setting the flag) - without implying an implementation.
> > > > > > > >
> > > > > > > > Initially system suspend problems had been discussed at
> > > > > > > > https://lore.kernel.org/linux-arm-kernel/20211002005954.1367653-8-l.stach@xxxxxxxxxxxxxx/
> > > > > > > > which led to discussing the pmic that contains the regulators which
> > > > > > > > serve as power-domain power-supplies:
> > > > > > > > https://lore.kernel.org/linux-pm/573166b75e524517782471c2b7f96e03fd93d175.camel@xxxxxxx/T/
> > > > > > > >
> > > > > > > > Signed-off-by: Martin Kepplinger <martin.kepplinger@xxxxxxx>
> > > > > > > > ---
> > > > > > > > drivers/base/power/domain.c | 13 +++++++++++--
> > > > > > > > include/linux/pm_domain.h | 5 +++++
> > > > > > > > 2 files changed, 16 insertions(+), 2 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> > > > > > > > index 5a2e0232862e..58376752a4de 100644
> > > > > > > > --- a/drivers/base/power/domain.c
> > > > > > > > +++ b/drivers/base/power/domain.c
> > > > > > > > @@ -130,6 +130,7 @@ static const struct genpd_lock_ops genpd_spin_ops = {
> > > > > > > >  #define genpd_is_active_wakeup(genpd) (genpd->flags & GENPD_FLAG_ACTIVE_WAKEUP)
> > > > > > > >  #define genpd_is_cpu_domain(genpd) (genpd->flags & GENPD_FLAG_CPU_DOMAIN)
> > > > > > > >  #define genpd_is_rpm_always_on(genpd) (genpd->flags & GENPD_FLAG_RPM_ALWAYS_ON)
> > > > > > > > +#define genpd_irq_on(genpd) (genpd->flags & GENPD_FLAG_IRQ_ON)
> > > > > > > >
> > > > > > > >  static inline bool irq_safe_dev_in_sleep_domain(struct device *dev,
> > > > > > > >  		const struct generic_pm_domain *genpd)
> > > > > > > > @@ -2065,8 +2066,15 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
> > > > > > > >  	genpd->domain.ops.runtime_suspend = genpd_runtime_suspend;
> > > > > > > >  	genpd->domain.ops.runtime_resume = genpd_runtime_resume;
> > > > > > > >  	genpd->domain.ops.prepare = genpd_prepare;
> > > > > > > > -	genpd->domain.ops.suspend_noirq = genpd_suspend_noirq;
> > > > > > > > -	genpd->domain.ops.resume_noirq = genpd_resume_noirq;
> > > > > > > > +
> > > > > > > > +	if (genpd_irq_on(genpd)) {
> > > > > > > > +		genpd->domain.ops.suspend = genpd_suspend_noirq;
> > > > > > > > +		genpd->domain.ops.resume = genpd_resume_noirq;
> > > > > > > > +	} else {
> > > > > > > > +		genpd->domain.ops.suspend_noirq = genpd_suspend_noirq;
> > > > > > > > +		genpd->domain.ops.resume_noirq = genpd_resume_noirq;
> > > > > > >
> > > > > > > As we discussed previously, I am thinking that it may be better to
> > > > > > > move to using genpd->domain.ops.suspend_late and
> > > > > > > genpd->domain.ops.resume_early instead.
> > > > > >
> > > > > > Wouldn't that better be a separate patch (on top)? Do you really want
> > > > > > me to change the current behaviour (default case) from noirq to
> > > > > > late? Then I'll resend this series with such a patch added.
> > > > >
> > > > > Sorry, I wasn't clear enough, the default behaviour should remain as
> > > > > is.
> > > > >
> > > > > What I meant was, when genpd_irq_on() is true, we should use the
> > > > > genpd->domain.ops.suspend_late and genpd->domain.ops.resume_early.
> > > >
> > > > Testing that shows that this isn't working. I can provide the logs
> > > > later, but suspend fails and I think it makes sense: "suspend_late" is
> > > > simply already too late when i2c (or any needed driver) uses
> > > > "suspend".
> > >
> > > Okay, I see.
> > >
> > > The reason why I suggested moving the callbacks to "suspend_late" was
> > > that I was worried that some of the attached devices to genpd could
> > > use "suspend_late" themselves. This is the case for some drivers for
> > > DMA/clock/gpio/pinctrl-controllers, for example. That said, I am
> > > curious to look at the DT files for the platform you are running,
> > > would you mind giving me a pointer?
> >
> > I'm running
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/imx8mq-librem5.dtsi
> > with these (small) patches on top:
> > https://source.puri.sm/martin.kepplinger/linux-next/-/commits/5.19.3/librem5
>
> Thanks for sharing the information!
>
> >
> > >
> > > So, this made me think about this a bit more. In the end, just using
> > > different levels (suspend, suspend_late, suspend_noirq) of callbacks
> > > is just papering over the real *dependency* problem.
> >
> > true, it doesn't feel like a stable solution.
> >
> > >
> > > What we need for the genpd provider driver, is to be asked to be
> > > suspended under the following conditions:
> > > 1. All consumer devices (and child-domains) for its corresponding PM
> > > domain have been suspended.
> > > 2. All its supplier devices must remain resumed, until the genpd
> > > provider has been suspended.
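The two conditions above amount to an ordering constraint on the suspend sequence. A rough userspace sketch of that constraint follows; it is not kernel code, and `model_suspend_pos()`, `model_order_ok()` and the integer device ids are hypothetical names made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Position of device `id` in the suspend sequence, or -1 if absent. */
static int model_suspend_pos(const int *order, size_t n, int id)
{
	for (size_t i = 0; i < n; i++)
		if (order[i] == id)
			return (int)i;
	return -1;
}

/*
 * Hypothetical check, not a kernel API. `order` lists device ids in the
 * order they are suspended (index 0 is suspended first). Returns true if:
 *  1. every consumer is suspended before the provider, and
 *  2. every supplier is suspended after the provider.
 */
bool model_order_ok(const int *order, size_t n, int provider,
		    const int *consumers, size_t n_cons,
		    const int *suppliers, size_t n_sup)
{
	assert(order != NULL);

	int p = model_suspend_pos(order, n, provider);

	if (p < 0)
		return false;

	for (size_t i = 0; i < n_cons; i++) {
		int c = model_suspend_pos(order, n, consumers[i]);

		if (c < 0 || c >= p)
			return false;	/* consumer still active when provider suspends */
	}

	for (size_t i = 0; i < n_sup; i++) {
		int s = model_suspend_pos(order, n, suppliers[i]);

		if (s < 0 || s <= p)
			return false;	/* supplier already gone when provider suspends */
	}

	return true;
}
```

In the case from this thread, the pmic regulator and its i2c controller would be the suppliers of the genpd provider, so a sequence that suspends i2c before the power domain fails the second check.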
> > >
> > > Please allow me a few more days to think in more detail about
> > > this.
> >
> > Thanks a lot for thinking about this!
>
> I have done some more thinking, but it's been a busy period for me, so
> unfortunately I need some additional time (another week). It seems
> like I also need to do some prototyping, to convince myself about the
> approach.
>
> So, my apologies for the delay!
To be honest, I'm happy as long as you don't forget about the bug. The
workaround I have (these patches) is solid enough for me to be able to
wait. And of course I'm always happy to answer specific questions or
test a patch.
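For anyone skimming the thread, the workaround boils down to choosing the callback phase by flag. Below is a minimal userspace model of that branch, not the kernel code itself; `struct model_genpd`, `struct model_pm_ops`, `MODEL_FLAG_IRQ_ON` and the `model_*` functions are simplified stand-ins invented for this sketch:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for GENPD_FLAG_IRQ_ON from the patch; the value is arbitrary. */
#define MODEL_FLAG_IRQ_ON (1U << 0)

/* Simplified stand-in for the system sleep callbacks in dev_pm_ops. */
struct model_pm_ops {
	int (*suspend)(void);
	int (*resume)(void);
	int (*suspend_noirq)(void);
	int (*resume_noirq)(void);
};

/* Simplified stand-in for struct generic_pm_domain. */
struct model_genpd {
	unsigned int flags;
	struct model_pm_ops ops;
};

/* Placeholders for genpd_suspend_noirq() / genpd_resume_noirq(). */
static int model_power_off(void) { return 0; }
static int model_power_on(void) { return 0; }

/* Models the branch the patch adds to pm_genpd_init(). */
void model_genpd_init(struct model_genpd *genpd)
{
	assert(genpd != NULL);
	genpd->ops = (struct model_pm_ops){ 0 };

	if (genpd->flags & MODEL_FLAG_IRQ_ON) {
		/* The supply needs interrupts: run while IRQs are still enabled. */
		genpd->ops.suspend = model_power_off;
		genpd->ops.resume = model_power_on;
	} else {
		/* Default case is unchanged: stay in the noirq phase. */
		genpd->ops.suspend_noirq = model_power_off;
		genpd->ops.resume_noirq = model_power_on;
	}
}
```

The same two functions are installed in both branches; only the phase in which they run differs, which is why this is a workaround rather than a fix for the underlying dependency ordering.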
thanks for the update!
martin