Re: [PATCH 1/2] PCI: pciehp: Add support for OS-First Hotplug and AER/DPC

From: Lukas Wunner
Date: Fri Nov 04 2022 - 06:15:45 EST


On Tue, Nov 01, 2022 at 12:07:18AM +0000, Smita Koralahalli wrote:
> The implementation is as follows: On an async remove a DPC is triggered as
> a side-effect along with an MSI to the OS. Determine it's an async remove
> by checking for DPC Trigger Status in DPC Status Register and Surprise
> Down Error Status in AER Uncorrected Error Status to be non-zero. If true,
> treat the DPC event as a side-effect of async remove, clear the error
> status registers and continue with hot-plug tear down routines. If not,
> follow the existing routine to handle AER/DPC errors.

Instead of having the OS recognize and filter Surprise Down events,
it would also be possible to simply set the Surprise Down bit in the
Uncorrectable Error Mask Register. This could be constrained to
Downstream Ports capable of surprise removal, i.e. those where the
is_hotplug_bridge flag in struct pci_dev is set. Both the check and
the register change could be performed in pci_dpc_init().
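
To make the suggestion concrete, here is a rough sketch of the masking
step, modeled in user space so it can be compiled and run on its own.
The register constants match include/uapi/linux/pci_regs.h. Everything
prefixed mock_ and the function name dpc_mask_surprise_down() are
hypothetical stand-ins for illustration, not existing kernel code. In
the kernel the accesses would go through pci_read_config_dword() and
pci_write_config_dword() on the real struct pci_dev.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_ERR_UNCOR_MASK	0x08		/* Uncorrectable Error Mask */
#define PCI_ERR_UNC_SURPDWN	0x00000020	/* Surprise Down */

/* Hypothetical stand-in for struct pci_dev: only the fields used here */
struct mock_pci_dev {
	bool is_hotplug_bridge;		/* port supports hot-plug */
	uint16_t aer_cap;		/* AER capability offset, 0 if absent */
	uint8_t config[4096];		/* fake extended config space */
};

/* Mock of a 32-bit config space read */
static uint32_t mock_read_dword(struct mock_pci_dev *pdev, int pos)
{
	uint32_t val;

	assert(pos >= 0 && (size_t)pos + sizeof(val) <= sizeof(pdev->config));
	memcpy(&val, &pdev->config[pos], sizeof(val));
	return val;
}

/* Mock of a 32-bit config space write */
static void mock_write_dword(struct mock_pci_dev *pdev, int pos, uint32_t val)
{
	assert(pos >= 0 && (size_t)pos + sizeof(val) <= sizeof(pdev->config));
	memcpy(&pdev->config[pos], &val, sizeof(val));
}

/*
 * Mask Surprise Down errors on hot-plug capable ports so that a surprise
 * removal is not reported as an uncorrectable error. Ports without the
 * hot-plug flag or without an AER capability are left untouched.
 */
static void dpc_mask_surprise_down(struct mock_pci_dev *pdev)
{
	uint32_t mask;

	if (!pdev->is_hotplug_bridge || !pdev->aer_cap)
		return;

	mask = mock_read_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_MASK);
	mask |= PCI_ERR_UNC_SURPDWN;
	mock_write_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_MASK, mask);
}
```

The read-modify-write keeps any mask bits that firmware or the AER
driver have already set.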

Have you considered such an alternative approach? If you have, what
was the reason to prefer the more complex solution you're proposing?


> +static void pci_clear_surpdn_errors(struct pci_dev *pdev)
> +{
> +	u16 reg16;
> +	u32 reg32;
> +
> +	pci_read_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, &reg32);
> +	pci_write_config_dword(pdev, pdev->dpc_cap + PCI_EXP_DPC_RP_PIO_STATUS, reg32);
> +
> +	pci_read_config_word(pdev, PCI_STATUS, &reg16);
> +	pci_write_config_word(pdev, PCI_STATUS, reg16);
> +
> +	pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &reg16);
> +	pcie_capability_write_word(pdev, PCI_EXP_DEVSTA, reg16);
> +}

I don't understand why PCI_STATUS and PCI_EXP_DEVSTA need to be
touched here.


> +static void pciehp_handle_surprise_removal(struct pci_dev *pdev)

Since this function is located in dpc.c and is only called from
other functions in the same file, it should be prefixed dpc_, not
pciehp_.


> + /*
> + * According to Section 6.13 and 6.15 of the PCIe Base Spec 6.0,
> + * following a hot-plug event, clear the ARI Forwarding Enable bit
> + * and AtomicOp Requester Enable as its not determined whether the
> + * next device inserted will support these capabilities. AtomicOp
> + * capabilities are not supported on PCI Express to PCI/PCI-X Bridges
> + * and any newly added component may not be an ARI device.
> + */
> + pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
> + (PCI_EXP_DEVCTL2_ARI | PCI_EXP_DEVCTL2_ATOMIC_REQ));

That looks like a reasonable change, but it belongs in a separate
patch. And I think it should be performed as part of (de-)enumeration,
not as part of DPC error handling. What about Downstream Ports which
are not DPC-capable? I guess the bits should be cleared there as well, no?

How about clearing the bits in pciehp_unconfigure_device()?
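
As a rough illustration of the bit manipulation involved, here is a
user-space model. The two bit definitions match
include/uapi/linux/pci_regs.h. struct mock_port and
mock_clear_hotplug_devctl2() are hypothetical names used only for this
sketch. In the kernel the same effect would come from a single
pcie_capability_clear_word() call on PCI_EXP_DEVCTL2, as in the quoted
hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_DEVCTL2_ARI		0x0020	/* ARI Forwarding Enable */
#define PCI_EXP_DEVCTL2_ATOMIC_REQ	0x0040	/* AtomicOp Requester Enable */

/* Hypothetical stand-in for a port's Device Control 2 register */
struct mock_port {
	uint16_t devctl2;
};

/*
 * Clear the two enables that depend on what sits below the port, since
 * the next device inserted may support neither. All other Device
 * Control 2 bits are preserved.
 */
static void mock_clear_hotplug_devctl2(struct mock_port *port)
{
	assert(port != NULL);
	port->devctl2 &= (uint16_t)~(PCI_EXP_DEVCTL2_ARI |
				     PCI_EXP_DEVCTL2_ATOMIC_REQ);
}
```

Doing this on the de-enumeration path would cover every hot-plug port,
whether or not it is DPC-capable.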

Thanks,

Lukas