Re: Hot ADD using CXL1.1 host

From: Dan Williams
Date: Mon Jan 30 2023 - 15:32:10 EST


[ add linux-cxl@xxxxxxxxxxxxxxx ]

Hi Shesha, I missed this earlier because it does not appear in my
"linux-cxl" filter. In general, mail to linux-kernel does not get a great
response from domain-specific experts, so I recommend going to the
domain-specific list, linux-cxl@ in this case. Comments below:

Shesha Bhushan Sreenivasamurthy wrote:
> From: Shesha Bhushan Sreenivasamurthy <sheshas@xxxxxxxxxxx>
> Date: Thursday, January 26, 2023 at 6:05 PM
> To: linux-kernel@xxxxxxxxxxxxxxx <linux-kernel@xxxxxxxxxxxxxxx>
> Subject: Hot ADD using CXL1.1 host
> Hi All,
>
> In our setup, the host is a CXL1.1 running fedora 6.1 kernel. This is
> connected to a Marvell CXL 2.0 Type-3 memory pooling device. The goal
> for me is to dynamically change the memory configuration without
> rebooting the host or the memory device.
>
> The approach that I am currently taking is to use dax. I configured
> the memory device to export 8G and the host sees 8G. I can successfully
> convert the memory from ‘devdax’ to ‘system-ram’ mode so that
> general applications can use it. At this point, I modify the memory on
> our memory device to export 16G and the host crashes within a few
> minutes. The steps I followed are the following:
>
>
> 1. Configure my memory device to export 8G
> 2. Boot host. BIOS populates SRAT table with size 8G.
> 3. daxctl list --regions --devices -u // Shows 8G
> 4. sudo daxctl reconfigure-device --mode=system-ram dax0.0 -f
> 5. Use memory in my application

Ok up to this point, no interaction with the CXL enabling. This is just
the default kernel behavior with a BIOS that applies the EFI_MEMORY_SP
attribute to an address range.
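
To confirm that a range came from the EFI_MEMORY_SP attribute rather than
from CXL enabling, look for it in /proc/iomem, where such ranges are
labeled "Soft Reserved". A minimal sketch follows; the helper name
soft_reserved_ranges is made up for illustration, and it assumes the usual
"start-end : name" layout of /proc/iomem:

```shell
# List "Soft Reserved" ranges from an iomem-style file (default: /proc/iomem).
# Note: without root, /proc/iomem reports the addresses as zeros.
soft_reserved_ranges() {
    iomem_file="${1:-/proc/iomem}"
    # Lines look like "  4080000000-427fffffff : Soft Reserved";
    # split on " : ", match the name exactly, strip leading indentation.
    awk -F' : ' '$2 == "Soft Reserved" { gsub(/^[ \t]+/, "", $1); print $1 }' "$iomem_file"
}
```

Run it as "sudo sh -c '. ./helper.sh; soft_reserved_ranges'" (the file name
is a placeholder) and compare the size of the reported range against what
daxctl shows.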

> 6. ---- RECONFIGURATION PART ----
> 7. sudo daxctl offline-memory dax0.0
> 8. sudo daxctl destroy-device dax0.0 -f // All numa node memory mappings are gone
> 9. sudo sh -c "echo 1 > /sys/bus/pci/devices/0000\:38\:00.0/remove"

Note that this only takes care of the software side, the CXL hardware /
decoder side is not touched.

> 10. Reconfigure memory device to be 16G

Does this reset the device?

> 11. sudo sh -c "echo 1 > /sys/bus/pci/rescan"
> * CXL DVSEC (Cap ID 0x23, DVSEC VendorID 0x1E98, DVSEC-ID: 0x0) shows size to be 16G 😊
> 12. daxctl list --regions --devices -u
> * This still shows 8G ☹

Yes, because there is currently no hookup between the CXL subsystem and
device-dax, but I am working on that:

https://lore.kernel.org/linux-cxl/63d21ce66e5c_ea22229446@xxxxxxxxxxxxxxxxxxxxxxxxx.notmuch/

> 13. System crashes
>
> There is a mismatch between what DAX is seeing and what PCI DVSEC is
> saying. Looks like I am missing some step so that old 8G information
> is removed from the system. Can someone advise?

So you need to dynamically recreate the region, especially if your step
10 above resets the device.

> Now, I can try the following
>
> 1. Power off memory device
> 2. Power on and boot my host
> 3. Power on memory device
> 4. Configure the memory device to have 8G
> 5. Follow the above 5-12 commands
>
> With this, the question I have is – will the host recognize the PCI
> device as CXL device and run cxl.mem protocol or will it just see it
> as PCIe device ? Note that the host is CXL1.1.

Does your device support the HDM decoder capability? As it stands the
driver expects to use HDM decoders for region creation rather than CXL
DVSEC range registers.

My expectation is that once the ram-region creation work is done you
should be able to do something like:

cxl disable-region $region
cxl disable-memdev $memdev
modprobe -r cxl_pci
<reconfigure device>
modprobe cxl_pci
cxl create-region ...

...and be back up and running with a new region with the updated size.
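
For rehearsing that sequence before the ram-region work lands, a dry-run
wrapper along these lines may help. It is only a sketch: the function names
(run, reconfigure_device, resize_cxl_region) and the DRY_RUN variable are
made up for illustration, the device reconfiguration step is vendor-specific
and left as a placeholder, and the create-region arguments are passed
through untouched because the right ones depend on your topology.

```shell
# Print each command (DRY_RUN=1, the default) or execute it (DRY_RUN=0).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical hook: replace with the vendor-specific step that resizes
# the memory device. Nothing about that step is assumed here.
reconfigure_device() {
    echo "# <reconfigure device> (vendor-specific, not defined here)"
}

# usage: resize_cxl_region REGION MEMDEV [create-region arguments...]
resize_cxl_region() {
    region="$1"; memdev="$2"; shift 2
    run cxl disable-region "$region" || return 1
    run cxl disable-memdev "$memdev" || return 1
    run modprobe -r cxl_pci || return 1
    reconfigure_device || return 1
    run modprobe cxl_pci || return 1
    # Arguments for create-region come from the caller; none are assumed.
    run cxl create-region "$@"
}
```

With DRY_RUN unset, "resize_cxl_region $region $memdev" only prints the six
steps in order; set DRY_RUN=0 and run as root to execute them.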