Re: [PATCH RFC 1/2] mm/memory_hotplug: no need to init new pgdat with node_start_pfn

From: Michal Hocko
Date: Wed Apr 22 2020 - 07:01:32 EST


On Wed 22-04-20 10:32:32, David Hildenbrand wrote:
> On 22.04.20 10:21, Michal Hocko wrote:
> > On Tue 21-04-20 15:06:20, David Hildenbrand wrote:
> >> On 21.04.20 14:52, Michal Hocko wrote:
> >>> On Tue 21-04-20 14:35:12, David Hildenbrand wrote:
> >>>> On 21.04.20 14:30, Michal Hocko wrote:
> >>>>> Sorry for the late reply
> >>>>>
> >>>>> On Thu 16-04-20 12:47:06, David Hildenbrand wrote:
> >>>>>> A hotadded node/pgdat will span no pages at all, until memory is moved to
> >>>>>> the zone/node via move_pfn_range_to_zone() -> resize_pgdat_range - e.g.,
> >>>>>> when onlining memory blocks. We don't have to initialize the
> >>>>>> node_start_pfn to the memory we are adding.
> >>>>>
> >>>>> You are right that the node is empty at this phase but that is already
> >>>>> reflected by zero present pages (hmm, I do not see spanned pages to be
> >>>>> set 0 though). What I am missing here is why this is an improvement. The
> >>>>> new node is already visible here and I do not see why we hide the
> >>>>> information we already know.
> >>>>
> >>>> "information we already know" - no, not before we online the memory.
> >>>
> >>> Is this really the case? All add_memory_resource users operate on a
> >>> physical memory range.
> >>
> >> Having the first add_memory() to magically set node_start_pfn of a hotplugged
> >> node isn't dangerous, I think we agree on that. It's just completely
> >> unnecessary here and at least left me confused why this is needed at all-
> >> because the node start/end pfn is only really touched when
> >> onlining/offlining memory (when resizing the zone and the pgdat).
> >
> > I do not see any specific problem. It just feels odd to
> > ignore the start pfn when we have that information. I am a little bit
> > worried that this might kick back. E.g. say we start using the memmaps
> > from the hotplugged memory then the initial part of the node will never
> > get online and we would have memmaps outside of the node span. I do not
>
> That's a general issue, which I pointed out in response to Oscar's last
> series. This needs more thought and reworks, especially how
> node_start_pfn/node_spanned_pages are glued to memory onlining/offlining
> today.
>
> > see an immediate problem except for the feeling this is odd.
>
> I think it's inconsistent. E.g., start with memory-less/cpu-less node
> and don't online memory from the kernel immediately.
>
> Hotplug CPU. PGDAT initialized with node_start_pfn=0. Hotplug memory.
> -> node_start_pfn=0 until memory is actually onlined.
>
> Hotplug memory. PGDAT initialized with node_start_pfn=$VALUE. Hotplug CPU.
> -> node_start_pfn=$VALUE
>
> Hotplug memory. PGDAT initialized with node_start_pfn=$VALUE. Hotplug
> CPU. Hotunplug memory.
> -> node_start_pfn=$VALUE, although there is no memory anymore.
>
> Hotplug memory 1. PGDAT initialized with node_start_pfn=$VALUE1. Hotplug
> memory 2. Hotunplug memory 2.
> -> node_start_pfn=$VALUE1 instead of $VALUE2.
>
>
> Again, because node_start_pfn has absolutely no meaning until memory is
> actually onlined - today.
>
> >
> > That being said I will shut up now and leave it alone.
>
> Is that a nack?

No, it's not. Nor am I going to ack this, but I will not stand in the
way. I would just urge you to put as many of the assumptions you are
making, and as much information as possible, in the changelog.
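
For anyone following along, the ordering scenarios quoted above can be
sketched as a toy userspace model. This is not kernel code: struct
toy_pgdat and every toy_* helper are hypothetical names made up for
illustration, and the "patched" flag merely stands for "initialize the
new pgdat with node_start_pfn=0, as the RFC proposes". Only toy_online()
is meant to loosely mirror what resize_pgdat_range() does when memory is
onlined.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for pg_data_t; only the fields under discussion. */
struct toy_pgdat {
	bool initialized;
	unsigned long node_start_pfn;
	unsigned long node_spanned_pages;
};

/* Only the first hotplug event on a node initializes its pgdat. */
static void toy_init_pgdat(struct toy_pgdat *p, unsigned long start_pfn)
{
	if (p->initialized)
		return;
	p->initialized = true;
	p->node_start_pfn = start_pfn;
	p->node_spanned_pages = 0;	/* nothing is spanned before onlining */
}

/* CPU hotplug has no memory range to offer, so the start pfn is 0. */
static void toy_hotplug_cpu(struct toy_pgdat *p)
{
	toy_init_pgdat(p, 0);
}

/*
 * Memory hot-add (not onlining). Assumption: "patched" models the RFC,
 * i.e. the range being added is not used to seed node_start_pfn.
 */
static void toy_hotplug_memory(struct toy_pgdat *p, unsigned long start_pfn,
			       bool patched)
{
	toy_init_pgdat(p, patched ? 0 : start_pfn);
}

/* Onlining grows the span; loosely modeled on resize_pgdat_range(). */
static void toy_online(struct toy_pgdat *p, unsigned long start_pfn,
		       unsigned long nr_pages)
{
	unsigned long old_end = p->node_start_pfn + p->node_spanned_pages;
	unsigned long new_end = start_pfn + nr_pages;

	assert(nr_pages > 0);
	if (!p->node_spanned_pages || start_pfn < p->node_start_pfn)
		p->node_start_pfn = start_pfn;
	if (new_end < old_end)
		new_end = old_end;
	p->node_spanned_pages = new_end - p->node_start_pfn;
}
```

In this model, without the change the value of node_start_pfn before
onlining depends only on whether a CPU or a memory block was hotplugged
first; with "patched" set it is 0 in both orders and only becomes
meaningful once toy_online() runs.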

--
Michal Hocko
SUSE Labs