Re: [PATCH v3 1/3] driver/base: Optimize memory block registration to reduce boot time

From: David Hildenbrand
Date: Thu May 08 2025 - 05:18:52 EST


On 05.05.25 15:24, Mike Rapoport wrote:
On Mon, May 05, 2025 at 10:18:43AM +0200, David Hildenbrand wrote:
On 05.05.25 09:53, Mike Rapoport wrote:
On Mon, May 05, 2025 at 09:38:43AM +0200, David Hildenbrand wrote:
On 05.05.25 09:28, Oscar Salvador wrote:
On Mon, May 05, 2025 at 09:16:48AM +0200, David Hildenbrand wrote:
memory hotplug code never calls register_one_node(), unless I am missing
something.

During add_memory_resource(), we call __try_online_node(nid, false), meaning
we skip register_one_node().

The only caller of __try_online_node(nid, true) is try_online_node(), called
from CPU hotplug code, and I *guess* that is not required.

Well, I guess this is because we need to link the cpus to the node.
register_one_node() has two jobs: 1) register cpus belonging to the node
and 2) register memory-blocks belonging to the node (if any).

Ah, via __register_one_node() ...

I would assume that an offline node

(1) has no memory
(2) has no CPUs

When we *hotplug* either memory or CPUs, and we first online the node, there
is nothing to register. Because if there would be something, the node would
already be online.

In particular, try_offline_node() will only offline a node if

(A) No present pages: No pages are spanned anymore. This includes
offline memory blocks.
(B) No present CPUs.

But maybe there is some case that I am missing ...

I actually hoped you and Oscar know how that stuff works :)

Well, I know how the memory side works, but the CPU side is giving me a hard
time :)


I tried to figure what is going on there and it all looks really convoluted.

Jap ...


So, on boot we have
cpu_up() ->
try_online_node() ->
bails out because all nodes are online (at least on
x86 AFAIU, see 1ca75fa7f19d ("arch/x86/mm/numa: Do
not initialize nodes twice"))
node_dev_init()i ->
register_one_node() ->
this one can use __register_one_node() and loop
over memblock regions.

And for the hotplug/unplug path, it seems that
register_memory_blocks_under_node(MEMINIT_EARLY) is superfluous, because if
a node had memory it wouldn't get offlined, and if we are hotplugging an
node with memory and cpus, memory hotplug anyway calls
register_memory_blocks_under_node_hotplug().

So, IMHO, register_one_node() should not call
register_memory_blocks_under_node() at all, but again, I might have missed
something :)

Hm, but someone has to create these links for the memory blocks.

My understanding that the links for the memory blocks during hotplug are created in

add_memory_resource()
register_memory_blocks_under_node()

Yes, during hotplug it's exactly that, after registering the node + setting it online.


So register_one_node() only calls register_memory_blocks_under_node() when
there are no actual memory resources under that node, isn't it?

Except in early boot. That's why register_one_node() has:

register_memory_blocks_under_node(nid, start_pfn, end_pfn,
MEMINIT_EARLY);

^ early :)
And that is triggered by

node_dev_init()->register_one_node()



Then we can drop the call to register_memory_blocks_under_node() from
register_one_node() and add creation of memory blocks to node_dev_init(),
i.e.

node_dev_init()
for_each_node(nid)
__register_one_node(nid)
for_each_mem_region()
/* create memory block if node matches */

Yes exactly, that makes sense.

--
Cheers,

David / dhildenb