Re: [PATCH 3/4] cxl, acpi/hmat: Update CXL access coordinates directly instead of through HMAT
From: Dave Jiang
Date: Thu Aug 14 2025 - 18:59:57 EST
On 8/14/25 3:33 PM, dan.j.williams@xxxxxxxxx wrote:
> Dave Jiang wrote:
>> The current implementation of CXL memory hotplug notifier gets called
>> before the HMAT memory hotplug notifier. The CXL driver calculates the
>> access coordinates (bandwidth and latency values) for the CXL end to
>> end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL
>> memory hotplug notifier writes the access coordinates to the HMAT target
>> structs. Then the HMAT memory hotplug notifier is called and it creates
>> the access coordinates for the node sysfs attributes.
>
> Perhaps summarize quickly here the before and after of sysfs, so people
> know if they are impacted by this bug, and backporters can verify they
> fixed it?
ok
>
>> The original intent of the 'ext_updated' flag in HMAT handling code was to
>> stop HMAT memory hotplug callback from clobbering the access coordinates
>> after CXL has injected its calculated coordinates and replaced the generic
>> target access coordinates provided by the HMAT table in the HMAT target
>> structs. However the flag is hacky at best and blocks the updates from
>> other CXL regions that are onlined in the same node later on. Remove the
>> 'ext_updated' flag usage and just update the access coordinates for the
>> nodes directly without touching HMAT target data.
>>
>> The hotplug memory callback ordering is changed. Instead of changing CXL,
>> move HMAT back so there's room for the levels rather than have CXL share
>> the same level as SLAB_CALLBACK_PRI. The change results in the CXL
>> callback being executed after the HMAT callback.
>>
>> With the change, the CXL hotplug memory notifier runs after the HMAT
>> callback. The HMAT callback will create the node sysfs attributes for
>> access coordinates. The CXL callback will write the access coordinates to
>> the now created node sysfs attributes directly and will not pollute the
>> HMAT target values.
>>
>> Fixes: debdce20c4f2 ("cxl/region: Deal with numa nodes not enumerated by SRAT")
>
> Why that one and not?
>
> 067353a46d8c cxl/region: Add memory hotplug notifier for cxl region
I think I grabbed the wrong line from 'git blame'.
>
> It is the ext_updated machinery that is the main problem that messes up
> sysfs, right?
>
> ...and per the backport concern this should be cc: stable as well.
>
> Other than that you can add:
>
> Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx>
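
For anyone following the ordering argument in the commit message, here is a
rough user-space sketch of why callback priority decides which value ends up
in the node attributes. It is not the kernel code: the demo_* names, the
priority numbers and the bandwidth values are all made up for illustration,
and it ignores the 'ext_updated' flag and the HMAT target structs entirely.
The only behavior it borrows from the kernel is that a notifier chain runs
higher-priority callbacks first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical priorities, chosen only for this illustration. */
enum { PRI_CXL_DEMO = 5, PRI_HMAT_OLD_DEMO = 2, PRI_HMAT_NEW_DEMO = 6 };

/* Stand-in for a NUMA node's sysfs access-coordinate attributes. */
struct demo_node {
	int attrs_created;
	unsigned int read_bw;
};

typedef void (*demo_cb)(struct demo_node *node);

struct demo_notifier {
	demo_cb cb;
	int priority;
};

#define DEMO_MAX 8

struct demo_chain {
	struct demo_notifier slot[DEMO_MAX];
	size_t count;
};

/* Insert so that the highest priority stays at the front of the chain. */
static void demo_register(struct demo_chain *chain, demo_cb cb, int priority)
{
	size_t i = chain->count;

	assert(chain->count < DEMO_MAX);
	while (i > 0 && chain->slot[i - 1].priority < priority) {
		chain->slot[i] = chain->slot[i - 1];
		i--;
	}
	chain->slot[i].cb = cb;
	chain->slot[i].priority = priority;
	chain->count++;
}

/* Run every callback in chain order, highest priority first. */
static void demo_call_chain(const struct demo_chain *chain,
			    struct demo_node *node)
{
	for (size_t i = 0; i < chain->count; i++)
		chain->slot[i].cb(node);
}

/* HMAT stand-in: creates the attributes with a generic table value. */
static void demo_hmat_cb(struct demo_node *node)
{
	node->attrs_created = 1;
	node->read_bw = 100;	/* made-up generic value */
}

/* CXL stand-in: writes the end-to-end value it calculated. */
static void demo_cxl_cb(struct demo_node *node)
{
	node->read_bw = 40;	/* made-up CPU-to-endpoint value */
}

/* Simulate onlining a region with HMAT at the given priority. */
static unsigned int demo_online(int hmat_priority)
{
	struct demo_chain chain = { .count = 0 };
	struct demo_node node = { 0, 0 };

	demo_register(&chain, demo_cxl_cb, PRI_CXL_DEMO);
	demo_register(&chain, demo_hmat_cb, hmat_priority);
	demo_call_chain(&chain, &node);
	return node.read_bw;
}
```

With HMAT below CXL (demo_online(PRI_HMAT_OLD_DEMO)) the HMAT stand-in runs
last and the generic value wins. With HMAT above CXL
(demo_online(PRI_HMAT_NEW_DEMO)) the attributes exist before the CXL stand-in
runs, so the CXL value is the one left in place.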