Re: [PATCH] PCI: PM: Skip devices in D0 for suspend-to-idle

From: Rafael J. Wysocki
Date: Thu Jun 13 2019 - 14:53:59 EST


On Thursday, June 13, 2019 12:14:02 AM CEST Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> Commit d491f2b75237 ("PCI: PM: Avoid possible suspend-to-idle issue")
> attempted to avoid a problem with devices whose drivers want them to
> stay in D0 over suspend-to-idle and resume, but it did not go as far
> as it should with that.
>
> Namely, first of all, it is questionable to change the power state
> of a PCI bridge with a device in D0 under it, but that is not
> actively prevented from happening during system-wide PM transitions,
> so use the skip_bus_pm flag introduced by commit d491f2b75237 for
> that.
>
> Second, the configuration of devices left in D0 (whatever the reason)
> during suspend-to-idle need not be changed and attempting to put them
> into D0 again by force may confuse some firmware, so explicitly avoid
> doing that.
>
> Fixes: d491f2b75237 ("PCI: PM: Avoid possible suspend-to-idle issue")
> Reported-by: Kai-Heng Feng <kai.heng.feng@xxxxxxxxxxxxx>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> ---
>
> Tested on Dell XPS13 9360 with no issues.
>
> ---
> drivers/pci/pci-driver.c | 47 +++++++++++++++++++++++++++++++++++------------
> 1 file changed, 35 insertions(+), 12 deletions(-)
>
> Index: linux-pm/drivers/pci/pci-driver.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-driver.c
> +++ linux-pm/drivers/pci/pci-driver.c
> @@ -524,7 +524,6 @@ static void pci_pm_default_resume_early(
> pci_power_up(pci_dev);
> pci_restore_state(pci_dev);
> pci_pme_restore(pci_dev);
> - pci_fixup_device(pci_fixup_resume_early, pci_dev);
> }
>
> /*
> @@ -842,18 +841,16 @@ static int pci_pm_suspend_noirq(struct d
>
> if (pci_dev->skip_bus_pm) {
> /*
> - * The function is running for the second time in a row without
> + * Either the device is a bridge with a child in D0 below it, or
> + * the function is running for the second time in a row without
> * going through full resume, which is possible only during
> - * suspend-to-idle in a spurious wakeup case. Moreover, the
> - * device was originally left in D0, so its power state should
> - * not be changed here and the device register values saved
> - * originally should be restored on resume again.
> + * suspend-to-idle in a spurious wakeup case. The device should
> + * be in D0 at this point, but if it is a bridge, it may be
> + * necessary to save its state.
> */
> - pci_dev->state_saved = true;
> - } else if (pci_dev->state_saved) {
> - if (pci_dev->current_state == PCI_D0)
> - pci_dev->skip_bus_pm = true;
> - } else {
> + if (!pci_dev->state_saved)
> + pci_save_state(pci_dev);
> + } else if (!pci_dev->state_saved) {
> pci_save_state(pci_dev);
> if (pci_power_manageable(pci_dev))
> pci_prepare_to_sleep(pci_dev);
> @@ -862,6 +859,22 @@ static int pci_pm_suspend_noirq(struct d
> dev_dbg(dev, "PCI PM: Suspend power state: %s\n",
> pci_power_name(pci_dev->current_state));
>
> + if (pci_dev->current_state == PCI_D0) {
> + pci_dev->skip_bus_pm = true;
> + /*
> + * Changing the power state of a PCI bridge with a device in D0
> + * below it is questionable, so avoid doing that by setting the
> + * skip_bus_pm flag for the parent bridge.
> + */
> + if (pci_dev->bus->self)
> + pci_dev->bus->self->skip_bus_pm = true;
> + }
> +
> + if (pci_dev->skip_bus_pm && !pm_suspend_via_firmware()) {
> + dev_dbg(dev, "PCI PM: Skipped\n");
> + goto Fixup;
> + }
> +
> pci_pm_set_unknown_state(pci_dev);
>
> /*
> @@ -909,7 +922,16 @@ static int pci_pm_resume_noirq(struct de
> if (dev_pm_smart_suspend_and_suspended(dev))
> pm_runtime_set_active(dev);
>
> - pci_pm_default_resume_early(pci_dev);
> + /*
> + * In the suspend-to-idle case, devices left in D0 during suspend will
> + * stay in D0, so it is not necessary to restore or update their
> + * configuration here and attempting to put them into D0 again may
> + * confuse some firmware, so avoid doing that.
> + */
> + if (!pci_dev->skip_bus_pm || pm_suspend_via_firmware())
> + pci_pm_default_resume_early(pci_dev);
> +
> + pci_fixup_device(pci_fixup_resume_early, pci_dev);
>
> if (pci_has_legacy_pm_support(pci_dev))
> return pci_legacy_resume_early(dev);
> @@ -1200,6 +1222,7 @@ static int pci_pm_restore_noirq(struct d
> }
>
> pci_pm_default_resume_early(pci_dev);
> + pci_fixup_device(pci_fixup_resume_early, pci_dev);
>
> if (pci_has_legacy_pm_support(pci_dev))
> return pci_legacy_resume_early(dev);
>

Bjorn, please let me know if you have any reservations here.

Since this has been confirmed to fix a reported issue, I'm about to queue it up for 5.2-rc6.

Cheers,
Rafael