Re: [PATCH v2] PCI/ASPM: Disable ASPM when save/restore PCI state

From: Dmytro Maluka
Date: Tue Jul 19 2022 - 15:11:55 EST


On 7/18/22 18:21, Dmytro Maluka wrote:
> While we're at it, I'm also wondering why for the basic PCI config (the
> first 256 bytes) Linux on x86 always uses the legacy 0xCF8/0xCFC method
> instead of MMCFG, even if MMCFG is available. The legacy method is
> inherently non-atomic and does require the global lock, while the MMCFG
> method generally doesn't, so using MMCFG would significantly speed up
> PCI config accesses in high-contention scenarios like the parallel
> suspend/resume.
>
> I've tried the below change which forces using MMCFG for the first 256
> bytes, and indeed, it makes suspend/resume of individual PCI devices
> with pm_async=1 almost as fast as with pm_async=0. In particular, it
> fixes the problem with slow GL9750 suspend/resume even without Victor's
> patch.
>
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -40,20 +40,20 @@ const struct pci_raw_ops *__read_mostly raw_pci_ext_ops;
> int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
> int reg, int len, u32 *val)
> {
> - if (domain == 0 && reg < 256 && raw_pci_ops)
> - return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
> if (raw_pci_ext_ops)
> return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
> + if (domain == 0 && reg < 256 && raw_pci_ops)
> + return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
> return -EINVAL;
> }
>
> int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
> int reg, int len, u32 val)
> {
> - if (domain == 0 && reg < 256 && raw_pci_ops)
> - return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
> if (raw_pci_ext_ops)
> return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
> + if (domain == 0 && reg < 256 && raw_pci_ops)
> + return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
> return -EINVAL;
> }
>
>
> Sounds good if I submit a patch like this? (I'm not suggesting it
> instead of Victor's patch, rather as a separate improvement.)

Ok, I found that a similar change was already suggested in the past by
Thomas [1] and got rejected by Linus [2].

Linus' arguments sound reasonable, and I understand that back then the
only known case of an issue with PCI config lock contention was with
Intel PMU counter registers which are in the extended config space
anyway. But now we know another case of such a contention, concerning
the basic config space too, namely: suspending or resuming many PCI
devices in parallel during system suspend/resume.

I've checked that on my box using MMCFG instead of Type 1 (i.e. using my
above patch) reduces the total suspend or resume time by 15-20 ms on
average. (I also had Victor's patch applied all the time, i.e. the ASPM
L1 exit latency issue was already resolved, so my test was about the PCI
lock contention in general.) So, not exactly a major improvement, yet
not exactly a negligible one. Maybe it's worth optimizing, maybe not.

Anyway, that's a bit of digression. Let's focus primarily on Victor's
ASPM patch.

[1]
https://lore.kernel.org/all/tip-b5b0f00c760b6e9673ab79b88ede2f3c7a039f74@xxxxxxxxxxxxxx/

[2]
https://lore.kernel.org/all/CA+55aFwi0tkdugfqNEz6M28RXM2jx6WpaDF4nfA=doUVdZgUNQ@xxxxxxxxxxxxxx/

Thanks,
Dmytro