Re: [PATCH RFC 2/2] mm/memory_hotplug: handle memblocks only with CONFIG_ARCH_KEEP_MEMBLOCK

From: Mike Rapoport
Date: Thu Apr 16 2020 - 13:09:18 EST


On Thu, Apr 16, 2020 at 12:47:07PM +0200, David Hildenbrand wrote:
> The comment in add_memory_resource() is stale: hotadd_new_pgdat() will
> no longer call get_pfn_range_for_nid(), as a hotadded pgdat will simply
> span no pages at all, until memory is moved to the zone/node via
> move_pfn_range_to_zone() - e.g., when onlining memory blocks.
>
> The only archs that care about memblocks for hotplugged memory (either
> for iterating over all system RAM or testing for memory validity) are
> arm64, s390x, and powerpc - due to CONFIG_ARCH_KEEP_MEMBLOCK. Without
> CONFIG_ARCH_KEEP_MEMBLOCK, we can simply stop messing with memblocks.
>
> For s390x, it seems to be fairly easy to avoid CONFIG_ARCH_KEEP_MEMBLOCK.
> arm64 could rework most code (esp., pfn_valid(), valid_phys_addr_range()
> and kexec_file_load()) to not require memblocks for hotplugged
> memory. E.g., as hotplugged memory has no holes and can be identified
> using !early_section(), arm64's variant of pfn_valid() could be reworked
> fairly easily to not require memblocks for hotadded memory. powerpc might
> be more involed.
>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: Baoquan He <bhe@xxxxxxxxxx>
> Cc: Oscar Salvador <osalvador@xxxxxxx>
> Cc: Pankaj Gupta <pankaj.gupta.linux@xxxxxxxxx>
> Cc: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> Cc: Anshuman Khandual <anshuman.khandual@xxxxxxx>
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>

Acked-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>

> ---
> mm/Kconfig | 3 +++
> mm/memory_hotplug.c | 13 +++++++------
> 2 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/mm/Kconfig b/mm/Kconfig
> index c1acc34c1c35..a063fd9cdff4 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -136,6 +136,9 @@ config HAVE_FAST_GUP
> depends on MMU
> bool
>
> +# Don't discard allocated memory used to track "memory" and "reserved" memblocks
> +# after early boot, so it can still be used to test for validity of memory.
> +# Also, memblocks are updated with memory hot(un)plug.
> config ARCH_KEEP_MEMBLOCK
> bool
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 9b15ce465be2..104285ee9ae8 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1020,13 +1020,9 @@ int __ref add_memory_resource(int nid, struct resource *res)
>
> mem_hotplug_begin();
>
> - /*
> - * Add new range to memblock so that when hotadd_new_pgdat() is called
> - * to allocate new pgdat, get_pfn_range_for_nid() will be able to find
> - * this new range and calculate total pages correctly. The range will
> - * be removed at hot-remove time.
> - */
> +#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
> memblock_add_node(start, size, nid);
> +#endif
>
> ret = __try_online_node(nid, false);
> if (ret < 0)
> @@ -1075,7 +1071,9 @@ int __ref add_memory_resource(int nid, struct resource *res)
> /* rollback pgdat allocation and others */
> if (new_node)
> rollback_node_hotadd(nid);
> +#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
> memblock_remove(start, size);
> +#endif
> mem_hotplug_done();
> return ret;
> }
> @@ -1751,8 +1749,11 @@ static int __ref try_remove_memory(int nid, u64 start, u64 size)
> mem_hotplug_begin();
>
> arch_remove_memory(nid, start, size, NULL);
> +
> +#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
> memblock_free(start, size);
> memblock_remove(start, size);
> +#endif
> __release_memory_resource(start, size);
>
> try_offline_node(nid);
> --
> 2.25.1
>

--
Sincerely yours,
Mike.