Re: [PATCH v10 14/17] cxl/pci: Introduce CXL Endpoint protocol error handlers

From: Bowman, Terry
Date: Tue Jul 22 2025 - 14:23:29 EST

On 7/21/2025 5:35 PM, Dave Jiang wrote:
>
> On 6/26/25 3:42 PM, Terry Bowman wrote:
>> CXL Endpoint protocol errors are currently handled using PCI error
>> handlers. The CXL Endpoint requires CXL specific handling in the case of
>> uncorrectable error (UCE) handling not provided by the PCI handlers.
>>
>> Add CXL specific handlers for CXL Endpoints. Rename the existing
>> cxl_error_handlers to be pci_error_handlers to more correctly indicate
>> the error type and follow naming consistency.
>>
>> The PCI handlers will be called if the CXL device is not trained for
>> alternate protocol (CXL). Update the CXL Endpoint PCI handlers to call the
>> CXL UCE handlers.
> Would the CXL device still be functional if it can't train the CXL protocols?
> Just wondering if we still need the standard PCI handlers in that case at all.
>
> DJ

A CXL EP failing training will not support CXL functionality.

Once training fails, the RAS registers may be unavailable. I'm concerned that accesses to
the MMIO RAS registers could cause an MCE if the PCIe device doesn't respond. It will
depend on how the training fails. This is a reason to remove the PCIe handlers.

BTW, the AER status will be logged by the AER driver before a PCIe handler is called.

A while back Dan mentioned we should leave the PCIe EP handlers in place. He may have an
opinion or more to add.

-Terry

[snip]