Re: + memory-hotplug-alloc-page-from-other-node-in-memory-online.patch added to -mm tree

From: KAMEZAWA Hiroyuki
Date: Sun Jul 05 2009 - 19:49:28 EST


On Fri, 3 Jul 2009 17:12:06 +0800
Shaohua Li <shaohua.li@xxxxxxxxx> wrote:

> On Fri, Jul 03, 2009 at 07:55:56AM +0800, KAMEZAWA Hiroyuki wrote:
> > On Thu, 2 Jul 2009 09:31:04 -0400 (EDT)
> > Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > On Thu, 2 Jul 2009, Yasunori Goto wrote:
> > >
> > > > However, I don't have enough time for memory hotplug now,
> > > > and they are just redundant functions now.
> > > > If someone creates a new allocator (and unifies the bootmem allocator),
> > > > I'd be very glad. :-)
> > >
> > > "Senior"ities all around.... A move like that would require serious
> > > commitment of time. None of us older developers can take that on it
> > > seems.
> > >
> > > Do we need to accept that the zone and page metadata are living on another
> > > node?
> > >
> > I don't think so. Someone should do it. I just think I can't do it _now_.
> > (because I have more things to do for cgroup..)
> >
> > And, outside of node-hotplug, memmap is allocated from local memory if possible.
> > Is the question "should we _never_ allow fallback to other nodes, or not"?
> > I think we should allow fallback.
> > As for pgdat and zones, I hope they will stay in cache...
> >
> > Maybe the following are necessary for allocating pgdat/zones from the local
> > node at node-hotplug.
> >
> > a) Add new tiny functions to allocate memory from a not-yet-initialized area.
> > Allocate pgdat/memmap from there if necessary.
> > b) Leave the memory allocated in (a) as PG_reserved at onlining.
> > c) There will be a "not unpluggable" section after (b). We should show this to
> > users.
> > d) For removal, we have to keep precise track of PG_reserved pages.
> > e) vmemmap removal, where large pages are used for vmemmap, is a problem.
> > The edges of a section's memmap are not aligned to large pages, so we need
> > some clever trick to handle this.
> >
> > Allocating memmap from its own section was an idea (I love this) but
> > IBM's 16MB memory section doesn't allow this.
> Adding code for allocation should not be hard, but it is hard to keep the memory
> unpluggable. For example, the vmemmap page table pages can map several
> sections and even several nodes (a pgd page). This will make some sections
> completely impossible to unplug if they hold page table pages.
> Is it possible to merge the workaround temporarily? Without it, the hotplug
> fails immediately on our side.
>
ZONE_MOVABLE is for that. I suspect the current ZONE_MOVABLE interface is not enough.
If a section should be removable later, it should be onlined as ZONE_MOVABLE,
as follows.

example)
echo removable_online > /sys/devices/system/memory/memoryXXX/online
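
The one-liner above could be wrapped in a small helper. This is only a sketch:
`removable_online` is the keyword proposed in this mail, not one the kernel is
known to accept, and `set_memory_block_state` and `SYSFS_MEMORY_ROOT` are
hypothetical names introduced here for illustration. The control file name
(`online`) is taken from the example as written.

```shell
# Hypothetical helper around the example above.
# SYSFS_MEMORY_ROOT may be pointed at a scratch directory for a dry run.
SYSFS_MEMORY_ROOT="${SYSFS_MEMORY_ROOT:-/sys/devices/system/memory}"

set_memory_block_state() {
    block="$1"      # the memoryXXX directory name
    keyword="$2"    # e.g. removable_online (proposed keyword, an assumption)
    ctl="$SYSFS_MEMORY_ROOT/$block/online"

    # Refuse to create a file that the kernel did not export.
    if [ ! -e "$ctl" ]; then
        echo "no control file: $ctl" >&2
        return 1
    fi

    # A keyword the kernel does not understand makes the write fail;
    # surface that instead of ignoring it.
    if ! echo "$keyword" > "$ctl"; then
        echo "write of '$keyword' to $ctl failed" >&2
        return 1
    fi
}
```

Usage would look like `set_memory_block_state memory32 removable_online`,
where `memory32` is a made-up block name.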

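On point (e) quoted above, and the remark that vmemmap page table pages can map
several sections, the sizes involved can be sketched with some arithmetic. Only
the 16MB section size comes from this thread. The 4KB base page, the 64-byte
struct page and the 2MB large page are assumptions chosen for illustration, and
the function names are hypothetical.

```shell
# Bytes of memmap needed to describe one memory section.
memmap_bytes_per_section() {
    section_bytes="$1"; page_bytes="$2"; struct_page_bytes="$3"
    echo $(( section_bytes / page_bytes * struct_page_bytes ))
}

# How many sections' memmap fit into one large page backing vmemmap.
sections_per_large_page() {
    large_page_bytes="$1"; memmap_bytes="$2"
    echo $(( large_page_bytes / memmap_bytes ))
}

# Assumed: 4KB pages, 64-byte struct page, 2MB large page.
memmap=$(memmap_bytes_per_section $((16 * 1024 * 1024)) 4096 64)
echo "memmap per 16MB section: $memmap bytes"      # 262144 (256KB)
echo "sections per 2MB vmemmap page: $(sections_per_large_page $((2 * 1024 * 1024)) "$memmap")"   # 8
```

Under these assumptions one large page holds the memmap of 8 sections, so
removing a single section cannot free the large page that backs its memmap.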

thx,
-Kame

> Thanks,
> Shaohua
