Re: [PATCH v9 04/16] PCI/AER: Dequeue forwarded CXL error

From: Dave Jiang
Date: Tue Jun 17 2025 - 12:24:21 EST




On 6/10/25 9:38 PM, Lukas Wunner wrote:
> On Tue, Jun 10, 2025 at 04:20:53PM -0500, Bowman, Terry wrote:
>> On 6/10/2025 1:07 PM, Bowman, Terry wrote:
>>> On 6/9/2025 11:15 PM, Lukas Wunner wrote:
>>>> On Tue, Jun 03, 2025 at 12:22:27PM -0500, Terry Bowman wrote:
>>>>> --- a/drivers/cxl/core/ras.c
>>>>> +++ b/drivers/cxl/core/ras.c
>>>>> +static int cxl_rch_handle_error_iter(struct pci_dev *pdev, void *data)
>>>>> +{
>>>>> + struct cxl_prot_error_info *err_info = data;
>>>>> + struct pci_dev *pdev_ref __free(pci_dev_put) = pci_dev_get(pdev);
>>>>> + struct cxl_dev_state *cxlds;
>>>>> +
>>>>> + /*
>>>>> + * The capability, status, and control fields in Device 0,
>>>>> + * Function 0 DVSEC control the CXL functionality of the
>>>>> + * entire device (CXL 3.0, 8.1.3).
>>>>> + */
>>>>> + if (pdev->devfn != PCI_DEVFN(0, 0))
>>>>> + return 0;
>>>>> +
>>>>> + /*
>>>>> + * CXL Memory Devices must have the 502h class code set (CXL
>>>>> + * 3.0, 8.1.12.1).
>>>>> + */
>>>>> + if ((pdev->class >> 8) != PCI_CLASS_MEMORY_CXL)
>>>>> + return 0;
>>>>> +
>>>>> + if (!is_cxl_memdev(&pdev->dev) || !pdev->dev.driver)
>>>>> + return 0;
>>>>
>>>> Is the point of the "!pdev->dev.driver" check to ascertain that
>>>> pdev is bound to cxl_pci_driver?
>>>>
>>>> If so, you need to check "if (pdev->driver != &cxl_pci_driver)"
>>>> directly (like cxl_handle_cper_event() does).
>>>>
>>>> That's because there are drivers which may bind to *any* PCI device,
>>>> e.g. vfio_pci_driver.
>>
>> Looking closer to implement this change I find the cxl_pci_driver is
>> defined static in cxl/pci.c and is unavailable to reference in
>> cxl/core/ras.c as-is. Would you like me to export cxl_pci_driver to
>> make available for this check?
>
> I'm not sure you need an export. The consumer you're introducing
> is located in core/ras.c, which is always built-in, never modular,
> hence just making it non-static and adding a declaration to cxlpci.h
> may be sufficient.
>
> An alternative would be to keep it static, but add a non-static helper
> cxl_pci_drv_bound() or something like that.
>
> I'm passing the buck to CXL maintainers for this. :)

I don't have a good solution to this. Moving the declaration of cxl_pci driver to core would be pretty messy. Perhaps doing the dance of calling try_module_get() is less messy? Or maybe Dan has a better idea....

DJ

>
>> The existing class code check guarantees it is a CXL EP. Is it not
>> safe to expect it is bound to a the CXL driver?
>
> Just checking for the pci_dev being bound seems insufficient to me
> because of the vfio_pci_driver case and potentially others.
>
> HTH,
>
> Lukas
>