Re: Hot ADD using CXL1.1 host

From: Shesha Sreenivasamurthy
Date: Mon Jan 30 2023 - 16:19:20 EST


The re-configuration does not reset the device. It does re-program the
PCIe DVSEC for CXL Device registers (Section 8.1.3 of the CXL 2.0 spec,
pg. 258; DVSEC Vendor ID 0x1E98, DVSEC ID 0x0).

“So you need to dynamically recreate the region, especially if your
step 10 above resets the device.”

Do you mean the DAX region? If so, I can recreate it as long as the
system stays up, but the system crashes after a few seconds. Could the
crash be caused by a mismatch between the DVSEC information and what
the BIOS reported to the kernel at boot (some ACPI tables)?
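
For anyone comparing the DVSEC value against what the kernel reports,
here is a minimal sketch of how the range size can be decoded by hand.
It assumes the CXL 2.0 layout as I read it: Range 1 Size High holds
bits 63:32 of the size, and bits 31:28 of Range 1 Size Low hold bits
31:28 of the size. The function name and the sample register values
are illustrative, not read from real hardware.

```shell
#!/bin/sh
# Sketch only: combine the two DVSEC range-size registers into a byte count.
# Assumed layout (CXL 2.0, DVSEC for CXL Device):
#   Size High [31:0]  -> size bits 63:32
#   Size Low  [31:28] -> size bits 31:28 (lower bits are flags and attributes)

# dvsec_range_size HIGH LOW
# HIGH and LOW are 32-bit register values, e.g. 0x00000002 0x00000003.
dvsec_range_size() {
    high=$(( $1 ))
    low=$(( $2 ))
    # Mask off the flag bits in the low register before combining.
    echo $(( (high << 32) | (low & 0xF0000000) ))
}
```

For example, `dvsec_range_size 0x2 0x00000003` prints 8589934592 (8G),
and `dvsec_range_size 0x4 0x00000003` prints 17179869184 (16G).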


Thanks,
Shesha.


On Mon, Jan 30, 2023 at 12:51 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> [ add linux-cxl@xxxxxxxxxxxxxxx ]
>
> Hi Shesha, I missed this earlier because it does not appear in my
> "linux-cxl" filter. In general, mail to linux-kernel does not get great
> response from domain-specific experts, so I recommend going to the
> domain-specific list, linux-cxl@ in this case. Comments below:
>
> Shesha Bhushan Sreenivasamurthy wrote:
> > From: Shesha Bhushan Sreenivasamurthy <sheshas@xxxxxxxxxxx>
> > Date: Thursday, January 26, 2023 at 6:05 PM
> > To: linux-kernel@xxxxxxxxxxxxxxx <linux-kernel@xxxxxxxxxxxxxxx>
> > Subject: Hot ADD using CXL1.1 host
> > Hi All,
> >
> > In our setup, the host is a CXL 1.1 system running Fedora with a 6.1
> > kernel. It is connected to a Marvell CXL 2.0 Type-3 memory pooling
> > device. My goal is to dynamically change the memory configuration
> > without rebooting the host or the memory device.
> >
> > The approach that I am currently taking is to use dax. I configured
> > the memory device to export 8G and the host sees 8G. I can
> > successfully convert the memory from ‘devdax’ to ‘system-ram’ mode so
> > that general applications can use it. At that point, I reconfigure
> > our memory device to export 16G and the host crashes within a few
> > minutes. I followed these steps:
> >
> >
> > 1. Configure my memory device to export 8G
> > 2. Boot host. BIOS populates SRAT table with size 8G.
> > 3. daxctl list --regions --devices -u // Shows 8G
> > 4. sudo daxctl reconfigure-device --mode=system-ram dax0.0 -f
> > 5. Use memory in my application
>
> Ok up to this point, no interaction with the CXL enabling. This is just
> the default kernel behavior with a BIOS that applies the EFI_MEMORY_SP
> attribute to an address range.
>
> > 6. ---- RECONFIGURATION PART ----
> > 7. sudo daxctl offline-memory dax0.0
> > 8. sudo daxctl destroy-device dax0.0 -f // All numa node memory mappings are gone
> > 9. sudo sh -c "echo 1 > /sys/bus/pci/devices/0000\:38\:00.0/remove"
>
> Note that this only takes care of the software side, the CXL hardware /
> decoder side is not touched.
>
> > 10. Reconfigure memory device to be 16G
>
> Does this reset the device?
>
> > 11. sudo sh -c "echo 1 > /sys/bus/pci/rescan"
> > * CXL DVSEC (Cap ID 0x23, DVSEC VendorID 0x1E98, DVSEC-ID: 0x0) shows size to be 16G 😊
> > 12. daxctl list --regions --devices -u
> > * This still shows 8G ☹
>
> Yes, because there is currently no hookup between the CXL subsystem and
> device-dax, but I am working on that:
>
> https://lore.kernel.org/linux-cxl/63d21ce66e5c_ea22229446@xxxxxxxxxxxxxxxxxxxxxxxxx.notmuch/
>
> > 13. System crashes
> >
> > There is a mismatch between what DAX is seeing and what the PCI DVSEC
> > is saying. It looks like I am missing a step that removes the old 8G
> > information from the system. Can someone advise?
>
> So you need to dynamically recreate the region, especially if your step
> 10 above resets the device.
>
> > Now, I can try the following
> >
> > 1. Power off memory device
> > 2. Power on and boot my host
> > 3. Power on memory device
> > 4. Configure the memory device to have 8G
> > 5. Follow the above 5-12 commands
> >
> > With this, the question I have is – will the host recognize the PCI
> > device as a CXL device and run the cxl.mem protocol, or will it just
> > see it as a PCIe device? Note that the host is CXL 1.1.
>
> Does your device support the HDM decoder capability? As it stands the
> driver expects to use HDM decoders for region creation rather than CXL
> DVSEC range registers.
>
> My expectation is that once the ram-region creation work is done you
> should be able to do something like:
>
> cxl disable-region $region
> cxl disable-memdev $memdev
> modprobe -r cxl_pci
> <reconfigure device>
> modprobe cxl_pci
> cxl create-region ...
>
> ...and be back up and running with a new region with the updated size.
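
The sequence quoted above could be wrapped in a small script along the
following lines. This is only a sketch under stated assumptions: it
presumes the ram-region creation work has landed, the function names
and the DRY_RUN variable are hypothetical, and the device
reconfiguration step is a placeholder because it is vendor-specific.
The arguments to `cxl create-region` are passed through untouched
since the thread does not specify them. By default it only prints the
commands.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints each command instead of
# running it; set DRY_RUN=0 to execute for real (requires root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical placeholder for the vendor-specific step that changes
# the exported capacity on the memory device.
reconfigure_device_hook() {
    echo "reconfigure the memory device now (vendor-specific)" >&2
}

# reconfigure_cxl REGION MEMDEV [cxl create-region arguments...]
reconfigure_cxl() {
    if [ "$#" -lt 2 ]; then
        echo "usage: reconfigure_cxl REGION MEMDEV [create-region args]" >&2
        return 2
    fi
    region="$1"
    memdev="$2"
    shift 2

    # Tear down in the order given in the thread, stopping on first failure.
    run cxl disable-region "$region" || return 1
    run cxl disable-memdev "$memdev" || return 1
    run modprobe -r cxl_pci || return 1
    run reconfigure_device_hook || return 1
    run modprobe cxl_pci || return 1
    # Remaining arguments go to create-region verbatim.
    run cxl create-region "$@"
}
```

Calling `reconfigure_cxl region0 mem0` with the default DRY_RUN=1
prints the six steps prefixed with "+ " without touching the system,
which makes it easy to review the order before running anything.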