Re: [PATCH] cxl/hdm: Fix hdm decoder init by adding COMMIT field check

From: Fan Ni
Date: Wed Mar 22 2023 - 12:45:55 EST


On Fri, Mar 03, 2023 at 02:36:25PM -0800, Dan Williams wrote:
> Fan Ni wrote:
> [..]
> > > I think a separate fix for that crash is needed, can you send the
> > > backtrace? I.e. I worry that crash can be triggered by other means.
> > Hi Dan,
> > See backtrace below.
>
> Thanks, I'll take a look.
>
> [..]
> > > > @@ -710,10 +711,11 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
> > > > base = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(which));
> > > > size = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which));
> > > > committed = !!(ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED);
> > > > + should_commit = !!(ctrl & CXL_HDM_DECODER0_CTRL_COMMIT);
> > >
> > > This change looks like a good idea in general given the ambiguity of
> > > 'committed'. However just combine the two checks into the @committed
> > > variable with something like this:
> > >
> > > commit_mask = CXL_HDM_DECODER0_CTRL_COMMITTED|CXL_HDM_DECODER0_CTRL_COMMIT;
> > > committed = (ctrl & commit_mask) == commit_mask;
>
> Did you also notice this ^^^ request for a fixed up version of the
> current patch?

Hi Dan,
Jonathan sent out a QEMU patch (link below) that fixes the reset of the
COMMITTED field, and with it the system crash discussed here no longer occurs.
https://lore.kernel.org/linux-cxl/20230322102731.4219-1-Jonathan.Cameron@xxxxxxxxxx/T/#me5283349b37d53abc93904a2428910a2f6a354f6
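
For reference, the combined check you suggested above can be sketched in
user space as below. This is only an illustration: the bit positions are
my assumption and should be checked against drivers/cxl/cxl.h, and
hdm_decoder_committed() is a hypothetical helper name, not an existing
kernel function. A control value with COMMITTED set but COMMIT clear is
treated as not committed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only; verify against cxl.h. */
#define CXL_HDM_DECODER0_CTRL_COMMIT    (UINT32_C(1) << 9)
#define CXL_HDM_DECODER0_CTRL_COMMITTED (UINT32_C(1) << 10)

/*
 * Hypothetical helper: report a decoder as committed only when both the
 * COMMIT request bit and the COMMITTED status bit are set in @ctrl.
 */
static bool hdm_decoder_committed(uint32_t ctrl)
{
	const uint32_t commit_mask = CXL_HDM_DECODER0_CTRL_COMMITTED |
				     CXL_HDM_DECODER0_CTRL_COMMIT;

	return (ctrl & commit_mask) == commit_mask;
}
```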

Do you think we need a separate fix on the kernel side for the possible
system crash when cxl_dpa_release() is called and dpa_res is NULL? I
noticed that in some places dpa_res is checked before cxl_dpa_release()
is called, for example in cxl_dpa_free(), but the other callers have no
such guard. If it is needed, I have a simple fix ready to send out.
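
To show the kind of guard in question, here is a stand-alone sketch. It
is not the kernel code and not the actual patch: the struct definitions
and dpa_release_guarded() are hypothetical stand-ins, and the real
release path does more than clear a pointer. The only point is the early
return when dpa_res is NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types for illustration; not the kernel definitions. */
struct resource_stub {
	unsigned long long start;
	unsigned long long size;
};

struct endpoint_decoder_stub {
	struct resource_stub *dpa_res;	/* NULL when no DPA is reserved */
	unsigned long long skip;
};

/*
 * Hypothetical release helper: bail out early if there is nothing to
 * release instead of dereferencing a NULL dpa_res.
 * Returns true if a reservation was released, false otherwise.
 */
static bool dpa_release_guarded(struct endpoint_decoder_stub *cxled)
{
	if (!cxled || !cxled->dpa_res)
		return false;

	/* The kernel would free the resource here; the sketch only clears it. */
	cxled->dpa_res = NULL;
	cxled->skip = 0;
	return true;
}
```

With a guard like this, a second release on the same decoder becomes a
no-op rather than a NULL dereference.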

Fan