Re: [PATCH v7 1/4] x86/numa: Fix SRAT lookup of CFMWS ranges with numa_fill_memblks()

From: Borislav Petkov
Date: Fri May 03 2024 - 15:47:13 EST


On Thu, May 02, 2024 at 03:10:09PM +0200, Robert Richter wrote:
> For configurations that have the kconfig option NUMA_KEEP_MEMINFO
> disabled, numa_fill_memblks() only returns with NUMA_NO_MEMBLK (-1).
> SRAT lookup then fails because an existing SRAT memory range cannot be
> found for a CFMWS address range. This causes the addition of a
> duplicate numa_memblk with a different node id and a subsequent page
> fault and kernel crash during boot.
>
> Fix this by making numa_fill_memblks() always available regardless of
> NUMA_KEEP_MEMINFO.
>
> As Dan suggested, the fix removes numa_fill_memblks() from
> sparsemem.h and also uses __weak for the function.
>
> Note that the issue was initially introduced with [1]. But since
> phys_to_target_node() was originally used, which returned the valid
> node 0, an additional numa_memblk was not added. The node id was
> wrong then too, though, and this message is seen in the logs:
>
> kernel/numa.c: pr_info_once("Unknown target node for memory at 0x%llx, assuming node 0\n",
>
> [1] commit fd49f99c1809 ("ACPI: NUMA: Add a node and memblk for each
> CFMWS not in SRAT")
>
> Suggested-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> Link: https://lore.kernel.org/all/66271b0072317_69102944c@xxxxxxxxxxxxxxxxxxxxxxxxx.notmuch/
> Fixes: 8f1004679987 ("ACPI/NUMA: Apply SRAT proximity domain to entire CFMWS window")
> Cc: Derick Marks <derick.w.marks@xxxxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: Alison Schofield <alison.schofield@xxxxxxxxx>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Reviewed-by: Alison Schofield <alison.schofield@xxxxxxxxx>
> Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> Signed-off-by: Robert Richter <rrichter@xxxxxxx>
> ---
> arch/x86/include/asm/sparsemem.h | 2 --
> arch/x86/mm/numa.c | 4 ++--
> drivers/acpi/numa/srat.c | 5 +++++
> include/linux/numa.h | 7 +------
> 4 files changed, 8 insertions(+), 10 deletions(-)

This needs to go through the ACPI tree but because Robert asked and FWIW:

Acked-by: Borislav Petkov (AMD) <bp@xxxxxxxxx>

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette