Re: [PATCH RFC 1/2] mm/memory_hotplug: no need to init new pgdat with node_start_pfn

From: David Hildenbrand
Date: Wed Apr 22 2020 - 04:32:45 EST


On 22.04.20 10:21, Michal Hocko wrote:
> On Tue 21-04-20 15:06:20, David Hildenbrand wrote:
>> On 21.04.20 14:52, Michal Hocko wrote:
>>> On Tue 21-04-20 14:35:12, David Hildenbrand wrote:
>>>> On 21.04.20 14:30, Michal Hocko wrote:
>>>>> Sorry for the late reply
>>>>>
>>>>> On Thu 16-04-20 12:47:06, David Hildenbrand wrote:
>>>>>> A hotadded node/pgdat will span no pages at all, until memory is moved to
>>>>>> the zone/node via move_pfn_range_to_zone() -> resize_pgdat_range - e.g.,
>>>>>> when onlining memory blocks. We don't have to initialize the
>>>>>> node_start_pfn to the memory we are adding.
>>>>>
>>>>> You are right that the node is empty at this phase but that is already
>>>>> reflected by zero present pages (hmm, I do not see spanned pages to be
>>>>> set 0 though). What I am missing here is why this is an improvement. The
>>>>> new node is already visible here and I do not see why we hide the
>>>>> information we already know.
>>>>
>>>> "information we already know" - no, not before we online the memory.
>>>
>>> Is this really the case? All add_memory_resource users operate on a
>>> physical memory range.
>>
>> Having the first add_memory() to magically set node_start_pfn of a hotplugged
>> node isn't dangerous, I think we agree on that. It's just completely
>> unnecessary here and at least left me confused why this is needed at all-
>> because the node start/end pfn is only really touched when
>> onlining/offlining memory (when resizing the zone and the pgdat).
>
> I do not see any specific problem. It just feels odd to
> ignore the start pfn when we have that information. I am little bit
> worried that this might kick back. E.g. say we start using the memmaps
> from the hotplugged memory then the initial part of the node will never
> get online and we would have memmaps outside of the node span. I do not

That's a general issue, which I pointed out in response to Oscar's last
series. This needs more thought and rework, especially how
node_start_pfn/node_spanned_pages are glued to memory onlining/offlining
today.

> see an immediate problem except for the feeling this is odd.

I think it's inconsistent. E.g., start with a memory-less/cpu-less node
and don't online memory from the kernel immediately.

Hotplug CPU. PGDAT initialized with node_start_pfn=0. Hotplug memory.
-> node_start_pfn=0 until memory is actually onlined.

Hotplug memory. PGDAT initialized with node_start_pfn=$VALUE. Hotplug CPU.
-> node_start_pfn=$VALUE

Hotplug memory. PGDAT initialized with node_start_pfn=$VALUE. Hotplug
CPU. Hotunplug memory.
-> node_start_pfn=$VALUE, although there is no memory anymore.

Hotplug memory 1. PGDAT initialized with node_start_pfn=$VALUE1. Hotplug
memory 2. Hotunplug memory 2.
-> node_start_pfn=$VALUE1 instead of $VALUE2.


Again, because node_start_pfn has absolutely no meaning until memory is
actually onlined - today.
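
To illustrate the first three orderings above, here is a toy model. It is
only a sketch: struct toy_pgdat and all toy_* helpers are hypothetical names
made up for this example, not kernel code. It assumes the behavior described
above, i.e. whoever initializes the pgdat first decides node_start_pfn, and
nothing resets it while no memory is onlined.

```c
#include <stdbool.h>

/* Toy stand-in for a pgdat; NOT the kernel's pg_data_t. */
struct toy_pgdat {
	bool initialized;
	unsigned long node_start_pfn;
	unsigned long node_spanned_pages;	/* stays 0: nothing is onlined */
};

/* Initialize the pgdat once; later calls leave it untouched. */
static void toy_init_pgdat(struct toy_pgdat *p, unsigned long start_pfn)
{
	if (p->initialized)
		return;
	p->initialized = true;
	p->node_start_pfn = start_pfn;
	p->node_spanned_pages = 0;
}

/* CPU hotplug on a new node: pgdat gets node_start_pfn=0. */
static void toy_hotplug_cpu(struct toy_pgdat *p)
{
	toy_init_pgdat(p, 0);
}

/* Memory hotplug on a new node: pgdat gets the start pfn of the range. */
static void toy_hotplug_memory(struct toy_pgdat *p, unsigned long start_pfn)
{
	toy_init_pgdat(p, start_pfn);
}

/* Hotunplug of never-onlined memory: the pgdat is not touched. */
static void toy_hotunplug_memory(struct toy_pgdat *p)
{
	(void)p;
}

/* Hotplug CPU, then hotplug memory. */
unsigned long toy_cpu_then_memory(unsigned long value)
{
	struct toy_pgdat p = { 0 };

	toy_hotplug_cpu(&p);
	toy_hotplug_memory(&p, value);
	return p.node_start_pfn;
}

/* Hotplug memory, then hotplug CPU. */
unsigned long toy_memory_then_cpu(unsigned long value)
{
	struct toy_pgdat p = { 0 };

	toy_hotplug_memory(&p, value);
	toy_hotplug_cpu(&p);
	return p.node_start_pfn;
}

/* Hotplug memory, hotplug CPU, then hotunplug the memory again. */
unsigned long toy_memory_cpu_unplug(unsigned long value)
{
	struct toy_pgdat p = { 0 };

	toy_hotplug_memory(&p, value);
	toy_hotplug_cpu(&p);
	toy_hotunplug_memory(&p);
	return p.node_start_pfn;
}
```

In all three cases the node spans no pages, yet the resulting
node_start_pfn depends purely on the order of events.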

>
> That being said I will shut up now and leave it alone.

Is that a nack?

Thanks for having a look!

--
Thanks,

David / dhildenb