Re: [RFC PATCH] mm: CONFIG_NR_ZONES_EXTENDED

From: Dan Williams
Date: Mon Feb 29 2016 - 21:06:27 EST


On Mon, Feb 29, 2016 at 4:06 PM, Vlastimil Babka <vbabka@xxxxxxx> wrote:
> On 29.2.2016 18:55, Dan Williams wrote:
>> On Mon, Feb 29, 2016 at 4:33 AM, Vlastimil Babka <vbabka@xxxxxxx> wrote:
>>> On 02/02/2016 06:42 AM, Andrew Morton wrote:
>>>> So if you want ZONE_DMA, you're limited to 512 NUMA nodes?
>>>>
>>>> That seems reasonable.
>>>
>>>
>>> Sorry for the late reply, but it seems that with !SPARSEMEM, or with
>>> SPARSEMEM_VMEMMAP, reducing NUMA nodes isn't even necessary, because
>>> SECTIONS_WIDTH is zero (see the diagrams in linux/page-flags-layout.h). In
>>> my brief tests with 4.4 based kernel with SPARSEMEM_VMEMMAP it seems that
>>> with 1024 NUMA nodes and 8192 CPUs, there's still 7 bits left (i.e. 6 with
>>> CONFIG_NR_ZONES_EXTENDED).
>>>
>>> With the danger of becoming even more complex, could the limit also depend
>>> on CONFIG_SPARSEMEM/VMEMMAP to reflect that somehow?
>>
>> In this case it's already part of the equation because:
>>
>> config ZONE_DEVICE
>>         depends on MEMORY_HOTPLUG
>>         depends on MEMORY_HOTREMOVE
>>
>> ...and those in turn depend on SPARSEMEM.
>
> Fine, but then SPARSEMEM_VMEMMAP should still be an available subvariant of
> SPARSEMEM with SECTIONS_WIDTH=0.

It should be, but not for the ZONE_DEVICE case. ZONE_DEVICE depends
on X86_64, which means ZONE_DEVICE also implies SPARSEMEM_VMEMMAP,
since:

config ARCH_SPARSEMEM_ENABLE
        def_bool y
        depends on X86_64 || NUMA || X86_32 || X86_32_NON_STANDARD
        select SPARSEMEM_STATIC if X86_32
        select SPARSEMEM_VMEMMAP_ENABLE if X86_64

Now, if a future patch wants to reclaim page flags space for other
uses outside of ZONE_DEVICE, it can do the work to handle the
SPARSEMEM_VMEMMAP=n case. I don't see a reason to fold that
distinction into the current patch given the current constraints.
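
For reference, the bit budget discussed above can be sketched in a few
lines of C. This is only an illustration under stated assumptions, not
the kernel's code: struct flags_layout, bits_for() and spare_bits() are
hypothetical names, and the arithmetic is a simplified reading of the
diagrams in linux/page-flags-layout.h. It shows why SECTIONS_WIDTH=0
leaves room for a wider zone field.

```c
/*
 * Hypothetical model of how page->flags is carved up. The field names
 * mirror the diagrams in linux/page-flags-layout.h, but this struct and
 * the helpers below are illustrative only and do not exist in the kernel.
 */
struct flags_layout {
	int word_bits;         /* bits in an unsigned long */
	int nr_pageflags;      /* ordinary PG_* flag bits (config dependent) */
	int sections_width;    /* 0 with !SPARSEMEM or SPARSEMEM_VMEMMAP */
	int nodes_width;       /* bits needed to index the NUMA nodes */
	int zones_width;       /* bits needed to index the zones */
	int last_cpupid_width; /* cpu + pid bits used by NUMA balancing */
};

/* Smallest number of bits that can index n distinct values (n >= 1). */
static int bits_for(unsigned long n)
{
	int bits = 0;

	while ((1UL << bits) < n)
		bits++;
	return bits;
}

/* Bits left over after all fields are placed; negative if they don't fit. */
static int spare_bits(const struct flags_layout *l)
{
	return l->word_bits - l->nr_pageflags - l->sections_width -
	       l->nodes_width - l->zones_width - l->last_cpupid_width;
}
```

As a usage example, take 64-bit longs, 1024 nodes, 8192 CPUs plus 8 pid
bits for last_cpupid, and sections_width = 0. If one further assumes 24
ordinary page flags (an assumed value, picked only so the numbers line
up with the figures quoted above; the real count depends on the config),
then spare_bits() gives 7 with 4 zones (2 bits) and 6 with 5 zones
(3 bits).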