Re: [PATCH] PCI: PM: Fix pci_power_up()

From: Rafael J. Wysocki
Date: Tue Oct 15 2019 - 17:19:59 EST


On Tue, Oct 15, 2019 at 9:20 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Mon, Oct 14, 2019 at 01:25:00PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >
> > There is an arbitrary difference between the system resume and
> > runtime resume code paths for PCI devices regarding the delay to
> > apply when switching the devices from D3cold to D0.
> >
> > Namely, pci_restore_standard_config() used in the runtime resume
> > code path calls pci_set_power_state() which in turn invokes
> > __pci_start_power_transition() to power up the device through the
> > platform firmware and that function applies the transition delay
> > (as per PCI Express Base Specification Revision 2.0, Section 6.6.1).
> > However, pci_pm_default_resume_early() used in the system resume
> > code path calls pci_power_up() which doesn't apply the delay at
> > all and that causes issues to occur during resume from
> > suspend-to-idle on some systems where the delay is required.
> >
> > Since there is no reason for that difference to exist, modify
> > pci_power_up() to follow pci_set_power_state() more closely and
> > invoke __pci_start_power_transition() from there to call the
> > platform firmware to power up the device (in case that's necessary).
> >
> > Fixes: db288c9c5f9d ("PCI / PM: restore the original behavior of pci_set_power_state()")
> > Reported-by: Daniel Drake <drake@xxxxxxxxxxxx>
> > Link: https://lore.kernel.org/linux-pm/CAD8Lp44TYxrMgPLkHCqF9hv6smEurMXvmmvmtyFhZ6Q4SE+dig@xxxxxxxxxxxxxx/T/#m21be74af263c6a34f36e0fc5c77c5449d9406925
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > ---
> >
> > Daniel, please test this one.
> >
> > ---
> > drivers/pci/pci.c | 24 +++++++++++-------------
> > 1 file changed, 11 insertions(+), 13 deletions(-)
> >
> > Index: linux-pm/drivers/pci/pci.c
> > ===================================================================
> > --- linux-pm.orig/drivers/pci/pci.c
> > +++ linux-pm/drivers/pci/pci.c
> > @@ -959,19 +959,6 @@ void pci_refresh_power_state(struct pci_
> > }
> >
> > /**
> > - * pci_power_up - Put the given device into D0 forcibly
> > - * @dev: PCI device to power up
> > - */
> > -void pci_power_up(struct pci_dev *dev)
> > -{
> > - if (platform_pci_power_manageable(dev))
> > - platform_pci_set_power_state(dev, PCI_D0);
> > -
> > - pci_raw_set_power_state(dev, PCI_D0);
> > - pci_update_current_state(dev, PCI_D0);
> > -}
> > -
> > -/**
> > * pci_platform_power_transition - Use platform to change device power state
> > * @dev: PCI device to handle.
> > * @state: State to put the device into.
> > @@ -1154,6 +1141,17 @@ int pci_set_power_state(struct pci_dev *
> > EXPORT_SYMBOL(pci_set_power_state);
> >
> > /**
> > + * pci_power_up - Put the given device into D0 forcibly
>
> Not specifically for this patch, but what does "forcibly" mean?
>
> > + * @dev: PCI device to power up
> > + */
> > +void pci_power_up(struct pci_dev *dev)
> > +{
> > + __pci_start_power_transition(dev, PCI_D0);
> > + pci_raw_set_power_state(dev, PCI_D0);
> > + pci_update_current_state(dev, PCI_D0);
>
> There's not very much difference between:
>
> pci_power_up(dev);
>
> and
>
> pci_set_power_state(dev, PCI_D0);
>
> It looks like the main difference is that pci_set_power_state() calls
> __pci_complete_power_transition(), which ultimately calls
> acpi_pci_set_power_state() (for ACPI systems).

Yes, it does, for power states deeper than D0, which is not the case here.

The main difference is the dev->current_state == state check in
pci_set_power_state(), but in the resume case specifically
dev->current_state == PCI_D0 doesn't matter, because the real power
state of the device may be different.

> So maybe "forcibly" means something like "ignoring any platform power
> management methods"?

It means "go into D0 no matter what the current cached value is".

> It's not obvious to me when we should skip the
> platform stuff or whether the skipping should be done at the high
> level (like calling either pci_power_up() or pci_set_power_state()) or
> at a lower level (e.g., if everybody called pci_set_power_state() and
> it could internally tell whether we're skipping the platform part).

For transitions into D0 __pci_start_power_transition() is the platform
stuff, so we don't skip it and the other things that are present in
pci_set_power_state() and are not there in pci_power_up() are simply
unnecessary for transitions to D0.

> If we could unify the paths as much as possible, that would be nice,
> but if it's not feasible, it's not feasible.

It kind of is, but I'd prefer to do it on top of this patch.

First, the pci_update_current_state() in pci_power_up() can be moved
to pci_pm_default_resume_early() which is the only caller of
pci_power_up(). [The role of that pci_update_current_state() is to
change the current_state value to D3cold if the device is not
accessible (or the platform firmware says that it is D3cold, which may
be the case after a failing attempt to use it to switch the device
over to D0).] Next, if pci_power_up() is modified to return the
return value of pci_raw_set_power_state(), pci_set_power_state() can
be implemented (roughly) as

sanitize the state argument

if (dev->current_state == state)
return 0;

if (state == PCI_D0)
return pci_power_up();

carry out a transition into a deeper power state.

And so pci_power_up() will be used by pci_set_power_state(), for
transitions into D0, and (directly) by pci_pm_default_resume_early().

How does that sound?

> If you'd like me to push this for v5.4, let me know, otherwise you
> can apply my:
>
> Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>

I will, thanks!