Re: + memory-hotplug-alloc-page-from-other-node-in-memory-online.patch added to -mm tree

From: KAMEZAWA Hiroyuki
Date: Wed Jul 01 2009 - 21:24:09 EST


On Thu, 02 Jul 2009 09:11:13 +0800
yakui <yakui.zhao@xxxxxxxxx> wrote:

> On Thu, 2009-07-02 at 01:22 +0800, Christoph Lameter wrote:
> > On Wed, 1 Jul 2009, yakui wrote:
> >
> > > If we can't allocate memory from other node when there is no memory on
> > > this node, we will have to do something like the bootmem allocator.
> > > After the memory page is added to the system memory, we will have to
> > > free the memory space used by the memory allocator. At the same time we
> > > will have to assure that the hot-plugged memory exists physically.
> >
> > The bootmem allocator must stick around, it seems. It's more like a node
> > bootstrap allocator then.
> >
> > Maybe we can generalize that. The bootstrap allocator may only need to be
> > able to boot one node (which simplifies the design). During system bringup
> > only the boot node is brought up.
> >
> > Then the other nodes are hotplugged later all in turn using the bootstrap
> > allocator for their node setup?
> Your idea looks fragrant. But it seems that it is difficult to realize.
> In the boot phase the bootmem allocator is initialized. And after the
> page buddy mechanism is enabled, the memory space used by bootmem
> allocator will be freed.
>
> If we also do the similar thing for the hotplugged node, how and when to
> free the memory space used by the bootstrap allocator? It seems that we
> will have to wait before all the memory sections are onlined for this
> hotplugged node. And before all the memory sections are onlined, the
> bootstrap allocator and buddy page allocator will co-exist.
>

When I was an eager developer of memory hotplug, I planned exactly that:
a special page allocator which works from the allocation of pgdat until memmap setup.
But there were problems.
For example:
1. We wanted to reuse bootmem.c, but it was difficult.
2. IBM guys use 16MB sections. So they cannot allocate a node-local pgdat/memmap
   the way other platforms, which have larger section sizes, can.
3. At memory hotplug, "the memory section which includes the pgdat for a node
   should be removed after all other sections on the node are removed".
   The same problem applies to memmap.

Because current memory hotplug works sanely and the above problems were too
complicated for me, I stopped. But there are more NUMA machines now than when
we initially implemented memory hotplug. I hope someone fixes this
mis-allocation problem.

IIUC, "3" is the worst problem. It creates a dependency among memory sections.
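
To illustrate the ordering dependency in "3", here is a minimal userspace
sketch. It is not kernel code: struct section, struct node, holds_pgdat,
can_remove_section() and remove_section() are all hypothetical names made up
for this example. It only assumes that one section on the node hosts the
node's pgdat (the same reasoning would apply to memmap) and shows why that
section has to go last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a hotpluggable node; none of this is a kernel API. */
struct section {
    bool online;      /* section is still present on the node */
    bool holds_pgdat; /* the node's pgdat was allocated from this section */
};

struct node {
    struct section *sections;
    size_t nr_sections;
};

/* Count the online sections on the node other than section idx. */
static size_t other_online(const struct node *n, size_t idx)
{
    size_t count = 0;

    for (size_t i = 0; i < n->nr_sections; i++)
        if (i != idx && n->sections[i].online)
            count++;
    return count;
}

/*
 * A section that hosts the pgdat may only be removed once every other
 * section on the node is already gone. All other sections are free to go.
 */
bool can_remove_section(const struct node *n, size_t idx)
{
    const struct section *s = &n->sections[idx];

    if (!s->online)
        return false;
    if (s->holds_pgdat && other_online(n, idx) > 0)
        return false;
    return true;
}

/* Take section idx offline if the ordering rule allows it. */
bool remove_section(struct node *n, size_t idx)
{
    assert(idx < n->nr_sections);

    if (!can_remove_section(n, idx))
        return false;
    n->sections[idx].online = false;
    return true;
}
```

With three online sections where section 0 hosts the pgdat, removing section 0
is refused until sections 1 and 2 have both been removed. That is the
dependency among sections which an allocation from another node avoids.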

Thanks,
-Kame







> thanks.
> >
> > There are a couple of things where one would want to spread out memory
> > across the nodes at boot time. How would node hotplugging handle that
> > situation?
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/