Re: [PATCH 2/2] mm, slab: support NUMA policy for large kmalloc
From: Vlastimil Babka
Date: Mon Jun 02 2025 - 04:24:37 EST
On 5/30/25 21:05, Christoph Lameter (Ampere) wrote:
> On Thu, 29 May 2025, Vlastimil Babka wrote:
>
>> On 5/29/25 16:57, Christoph Lameter (Ampere) wrote:
>> > On Thu, 29 May 2025, Vlastimil Babka wrote:
>> >
>> >> The slab allocator observes the task's numa policy in various places
>> >> such as allocating slab pages. Large kmalloc allocations currently do
>> >> not, which seems to be an unintended omission. It is simple to correct
>> >> that, so make ___kmalloc_large_node() behave the same way as
>> >> alloc_slab_page().
>> >
>> > Large kmalloc allocation lead to the use of the page allocator which
>> > implements the NUMA policies for the allocations.
>> >
>> > This patch is not necessary.
>>
>> I'm confused, as that's only true depending on which page allocator entry
>> point you use. AFAICS before this series, it's using
>> alloc_pages_node_noprof() which only does
>>
>>
>> if (nid == NUMA_NO_NODE)
>> nid = numa_mem_id();
>>
>> and no mempolicies.
>
> That is a bug.
>
>> I see this patch as analogical to your commit 1941b31482a6 ("Reenable NUMA
>> policy support in the slab allocator")
>>
>> Am I missing something?
>
> The page allocator has its own NUMA suport.
It has support for respecting a preferred node (or forced node with
__GFP_THISNODE), nodemask, cpusets.
Mempolicy support is arguably outside the page allocator itself -
alloc_pages_mpol() lives in mm/mempolicy.c. Although some generically
looking wrappers alloc_pages() lead to it, while others don't
(__alloc_pages()). It's a kinda mess.
> The patch to reenable NUMA support dealt with an issue within the
> allocator where the memory policies were ignored.
I'd agree in the sense the issue was within the slab allocator, calling the
non-mpol-aware page allocator entry, which was not intended.
> It seems that the error was repeated for large kmalloc allocations.
After some digging, seems you're right and the error was done in commit
c4cab557521a ("mm/slab_common: cleanup kmalloc_large()") in v6.1. I'll add a
Fixes: tag and reword changelog accordingly.
> Instead of respecting memory allocation policies the allocation is forced
> to be local to the node.
It's not forced, but preferred, unless kmalloc() caller itself passes
__GFP_THISNODE. The slab layer doesn't add it.
> The forcing to the node is possible with GFP_THISNODE. The default needs
> to be following memory policies.
>