Re: [PATCH] cxl/port: Disable decoder setup for endpoints in RCD mode

From: Robert Richter
Date: Wed Feb 15 2023 - 11:35:53 EST


On 14.02.23 14:28:51, Dan Williams wrote:
> Robert Richter wrote:
> > Dan,
> >
> > On 09.02.23 09:07:18, Dan Williams wrote:
> > > Robert Richter wrote:
> > > > In RCD mode the HDM decoder capability is optional for endpoints and
> > > > may not exist. The HDM range registers are used instead. Since the
> > > > driver relies on the existence of an HDM decoder capability, its
> > > > absence will cause the initialization of a memory card to fail.
> > > >
> > > > Moreover, the driver also tries to enable or reuse enabled memory
> > > > ranges. In the worst case this may lead to a system hang due to
> > > > disabling system memory that was previously provided and set up by
> > > > system firmware.
> > > >
> > > > To solve the issues described, disable decoder setup for RCD endpoints
> > > > and instead rely exclusively on system firmware to enable those memory
> > > > ranges. Decoders are used by the kernel to set up and configure CXL
> > > > memory regions, esp. to enable and disable them. Since hot-plug is not
> > > > supported for devices in RCD mode, the ability to disable that memory
> > > > by the kernel using a decoder is not necessarily a requirement, so
> > > > decoders are not needed then.
> > > >
> > > > Fixes: 34e37b4c432c ("cxl/port: Enable HDM Capability after validating DVSEC Ranges")
> > > > Signed-off-by: Robert Richter <rrichter@xxxxxxx>
> > >
> > > Does Dave's series address this problem?
> > >
> > > https://lore.kernel.org/linux-cxl/167588394236.1155956.8466475582138210344.stgit@djiang5-mobl3.local/
> > >
> > > ...that is arranging for the driver to carry-on in the absence of the
> > > HDM Decoder Capability.
> >
> > It might only solve the missing HDM decoder capability. I need to take
> > a closer look at whether that also solves a system hang I was debugging,
> > which is caused by clearing the memory disable bit in the HDM DVSEC range
> > register. So the best would be to use this patch now to fix decoder
> > initialization in RCD mode and then have Dave's patches on top. I am
> > going to test the series too.
>
> My concern with this patch is that it skips HDM decoder enumeration
> entirely in RCD mode. The CXL cards I have seen are CXL 1.1+ and do
> export the HDM decoder capability.
>
> The driver turns off mem_enable in a few scenarios, one of them indeed
> looks buggy, but does not seem to be the one you addressed. The driver
> should only disable mem if it was also the agent that enabled mem, but
> looks like it does not always do that.
>
> Can you confirm if this fixes this issue?

I have tested the HDM decoder emulation series (v5) and it fixes the
issue. Looking into the particular change for that, I hope to get a
condensed fix for stable.
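
For reference, the rule Dan describes above (only disable mem if the
driver was also the agent that enabled it) could be sketched roughly as
follows. This is a standalone illustration only: struct mem_state and
both sketch_* helpers are hypothetical names, not code from the driver
or from the series.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a device's memory-enable state; not a kernel struct. */
struct mem_state {
	bool mem_enabled;        /* simulated state of the mem_enable bit */
	bool enabled_by_driver;  /* true only if the driver itself set the bit */
};

/* Enable memory; take ownership only if the bit was not already set. */
static void sketch_mem_enable(struct mem_state *s)
{
	if (s->mem_enabled)
		return; /* already enabled (e.g. by firmware): do not claim ownership */
	s->mem_enabled = true;
	s->enabled_by_driver = true;
}

/* Disable memory only if the driver was the agent that enabled it. */
static void sketch_mem_disable(struct mem_state *s)
{
	if (!s->enabled_by_driver)
		return; /* enabled by someone else: leave it alone */
	s->mem_enabled = false;
	s->enabled_by_driver = false;
}
```

With this shape, a range that firmware enabled before the driver loaded
stays enabled across an enable/disable pair, while a range the driver
enabled itself is torn down again.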

Thanks,

-Robert