On 6/17/25 5:39 AM, Neeraj Kumar wrote:
In 84ec985944ef3, For cxl pmem region auto-assembly after endpoint port
probing, cxl_nvd presence was required. And for cxl region persistency,
region creation happens during nvdimm_probe which need the completion
of endpoint probe.
It is therefore refactored cxl pmem region auto-assembly after endpoint
probing to cxl mem probing
The region auto-assembly is moved after the endpoint device is added. However, it's not guaranteed that the endpoint probe has completed and completed successfully. You are just getting lucky timing wise that endpoint probe is completed when the new region discovery is starting.
Return of devm_cxl_add_nvdimm() only adds the nvdimm device and triggers the async probe of nvdimm driver. You have to take the endpoint port lock
and check if the driver is attached to be certain that endpoint probe is done and successful. Therefore moving the region discovery location probably does not do what you think it does.
Maybe take a deeper look at the region discovery code and see how it does retry if things are not present and approach it from that angle?
Signed-off-by: Neeraj Kumar <s.neeraj@xxxxxxxxxxx>
---
drivers/cxl/core/port.c | 38 ++++++++++++++++++++++++++++++++++++++
drivers/cxl/cxl.h | 1 +
drivers/cxl/mem.c | 27 ++++++++++++++++++---------
drivers/cxl/port.c | 38 --------------------------------------
4 files changed, 57 insertions(+), 47 deletions(-)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 78a5c2c25982..bca668193c49 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1038,6 +1038,44 @@ void put_cxl_root(struct cxl_root *cxl_root)
}
EXPORT_SYMBOL_NS_GPL(put_cxl_root, "CXL");
+static int discover_region(struct device *dev, void *root)
+{
+ struct cxl_endpoint_decoder *cxled;
+ int rc;
+
+ if (!is_endpoint_decoder(dev))
+ return 0;
+
+ cxled = to_cxl_endpoint_decoder(dev);
+ if ((cxled->cxld.flags & CXL_DECODER_F_ENABLE) == 0)
+ return 0;
+
+ if (cxled->state != CXL_DECODER_STATE_AUTO)
+ return 0;
+
+ /*
+ * Region enumeration is opportunistic, if this add-event fails,
+ * continue to the next endpoint decoder.
+ */
+ rc = cxl_add_to_region(root, cxled);
+ if (rc)
+ dev_dbg(dev, "failed to add to region: %#llx-%#llx\n",
+ cxled->cxld.hpa_range.start, cxled->cxld.hpa_range.end);
+
+ return 0;
+}
+
+void cxl_region_discovery(struct cxl_port *port)
+{
+ struct cxl_port *root;
+ struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
+
+ root = &cxl_root->port;
+
+ device_for_each_child(&port->dev, root, discover_region);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_region_discovery, "CXL");
+
I have concerns about adding region related code in core/port.c while the rest of the region code is walled behind CONFIG_CXL_REGION. I think this change needs to go to core/region.c.
DJ