Re: [PATCH v9 04/16] PCI/AER: Dequeue forwarded CXL error

From: Lukas Wunner
Date: Wed Jun 11 2025 - 00:39:21 EST


On Tue, Jun 10, 2025 at 04:20:53PM -0500, Bowman, Terry wrote:
> On 6/10/2025 1:07 PM, Bowman, Terry wrote:
> > On 6/9/2025 11:15 PM, Lukas Wunner wrote:
> >> On Tue, Jun 03, 2025 at 12:22:27PM -0500, Terry Bowman wrote:
> >>> --- a/drivers/cxl/core/ras.c
> >>> +++ b/drivers/cxl/core/ras.c
> >>> +static int cxl_rch_handle_error_iter(struct pci_dev *pdev, void *data)
> >>> +{
> >>> + struct cxl_prot_error_info *err_info = data;
> >>> + struct pci_dev *pdev_ref __free(pci_dev_put) = pci_dev_get(pdev);
> >>> + struct cxl_dev_state *cxlds;
> >>> +
> >>> + /*
> >>> + * The capability, status, and control fields in Device 0,
> >>> + * Function 0 DVSEC control the CXL functionality of the
> >>> + * entire device (CXL 3.0, 8.1.3).
> >>> + */
> >>> + if (pdev->devfn != PCI_DEVFN(0, 0))
> >>> + return 0;
> >>> +
> >>> + /*
> >>> + * CXL Memory Devices must have the 502h class code set (CXL
> >>> + * 3.0, 8.1.12.1).
> >>> + */
> >>> + if ((pdev->class >> 8) != PCI_CLASS_MEMORY_CXL)
> >>> + return 0;
> >>> +
> >>> + if (!is_cxl_memdev(&pdev->dev) || !pdev->dev.driver)
> >>> + return 0;
> >>
> >> Is the point of the "!pdev->dev.driver" check to ascertain that
> >> pdev is bound to cxl_pci_driver?
> >>
> >> If so, you need to check "if (pdev->driver != &cxl_pci_driver)"
> >> directly (like cxl_handle_cper_event() does).
> >>
> >> That's because there are drivers which may bind to *any* PCI device,
> >> e.g. vfio_pci_driver.
>
> Looking closer at implementing this change, I find that cxl_pci_driver
> is defined static in cxl/pci.c and cannot be referenced from
> cxl/core/ras.c as-is. Would you like me to export cxl_pci_driver to
> make it available for this check?

I'm not sure you need an export. The consumer you're introducing
is located in core/ras.c, which is always built-in, never modular,
hence just making it non-static and adding a declaration to cxlpci.h
may be sufficient.

An alternative would be to keep it static, but add a non-static helper
cxl_pci_drv_bound() or something like that.
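
A rough sketch of that second option, written as a userspace mock-up
rather than actual kernel code. The struct definitions are simplified
stand-ins for the kernel's (an assumption for illustration only), the
helper name is just the placeholder from above, and a real version
would have to follow whatever locking and placement the CXL
maintainers prefer:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified stand-ins for the kernel structures (assumption: only the
 * fields needed for this illustration are modelled).
 */
struct pci_driver {
	const char *name;
};

struct pci_dev {
	struct pci_driver *driver;	/* NULL when no driver is bound */
};

/* Stays static, as it is in cxl/pci.c today. */
static struct pci_driver cxl_pci_driver = { .name = "cxl_pci" };

/* Stand-in for a driver that may bind to *any* PCI device. */
static struct pci_driver vfio_pci_driver = { .name = "vfio-pci" };

/*
 * Hypothetical non-static helper: true only if pdev is bound to
 * cxl_pci_driver, false if it is unbound or bound to anything else.
 */
bool cxl_pci_drv_bound(const struct pci_dev *pdev)
{
	return pdev->driver == &cxl_pci_driver;
}
```

With this shape, a device bound to the vfio stand-in has a non-NULL
driver pointer, so a plain "is a driver bound?" test would accept it,
while cxl_pci_drv_bound() rejects it.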

I'm passing the buck to CXL maintainers for this. :)

> The existing class code check guarantees it is a CXL EP. Is it not
> safe to expect it is bound to the CXL driver?

Just checking for the pci_dev being bound seems insufficient to me
because of the vfio_pci_driver case and potentially others.

HTH,

Lukas