We'd have to be smart about memory blocks that fall into multiple regions,
but it should be a corner case and doable.
This is a corner case that should be handled regardless of the loop order.
And I don't think it's handled today at all.
If we have a block that crosses node boundaries, the current implementation
of register_mem_block_under_node_early() will register it under the first
node only.
OTOH, we usually don't expect to have a lot of regions, so iterating over
them is probably not a big bottleneck? Anyhow, just wanted to raise it.
There would be at least one region per node, and having

	for_each_online_node()
		for_each_mem_region()

makes the loop O(n²) for no good reason.
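
To make the cost difference concrete, here is a minimal userspace sketch, not kernel code. struct mem_region and both helper functions are hypothetical stand-ins invented for illustration; they only count how many region visits each loop order needs, assuming every region carries its own node id.

```c
#include <stddef.h>

/* Hypothetical stand-in for a memory region: [start, end) owned by node nid. */
struct mem_region {
	unsigned long start;
	unsigned long end;
	int nid;
};

/*
 * Node-major order: for every node, scan every region and pick the ones
 * that belong to it. Returns the number of region visits.
 */
size_t visits_node_then_region(const struct mem_region *r, size_t nr_regions,
			       int nr_nodes, size_t *registered)
{
	size_t visits = 0;

	for (int nid = 0; nid < nr_nodes; nid++) {
		for (size_t i = 0; i < nr_regions; i++) {
			visits++;
			if (r[i].nid == nid)
				(*registered)++;
		}
	}
	return visits;
}

/*
 * Region-major order: walk the regions once and use the node id stored
 * in each region. Returns the number of region visits.
 */
size_t visits_region_only(const struct mem_region *r, size_t nr_regions,
			  int nr_nodes, size_t *registered)
{
	size_t visits = 0;

	for (size_t i = 0; i < nr_regions; i++) {
		visits++;
		if (r[i].nid >= 0 && r[i].nid < nr_nodes)
			(*registered)++;
	}
	return visits;
}
```

With one region per node, both variants register the same regions, but the node-major loop visits nodes * regions entries while the region-major loop visits each region once. A block that spans several regions still needs separate handling in either order.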