Re: memory hotremove prototype, take 3

From: Martin J. Bligh
Date: Thu Dec 04 2003 - 12:13:46 EST


>> > My target is somewhat NUMA-ish and fairly large. So I'm not sure if
>> > CONFIG_NONLINEAR fits, but CONFIG_NUMA isn't perfect either.
>>
>> If your target is NUMA, then you really, really need CONFIG_NONLINEAR.
>> We don't support multiple pgdats per node, nor do I wish to, as it'll
>> make an unholy mess ;-). With CONFIG_NONLINEAR, the discontiguities
>> within a node are buried down further, so we have much less complexity
>> to deal with from the main VM. The abstraction also keeps the poor
>> VM engineers trying to read / write the code saner via simplicity ;-)
>
> IIRC, memory is contiguous within a NUMA node. I think Goto-san will
> clarify this issue when his code gets ready. :-)

Right - but then you can't use discontigmem's multiple pgdats inside
a node to implement hotplug mem for NUMA systems.

> Preallocating the struct page array isn't feasible for the target system
> because the max memory / min memory ratio is large.
> Our plan is to use the beginning (or the end) of the memory block being
> hotplugged. If a 2GB memory block is added, the first ~20MB is used for
> the struct page array for the rest of the memory block.

Right - that makes perfect sense, it just has 2 problems:

1) You end up with a discontiguous mem_map array (fixable by adding a layer
of indirection in the wrapped macros).
2) On 32-bit, it's going to make a mess, as you need to map mem_map
inside the permanently mapped kernel area (aka ZONE_NORMAL + vmalloc space,
except in a kind of weird corner case I created with remap_numa_kva,
which creates a no-man's land of permanently mapped kernel memory
between ZONE_NORMAL and the VMALLOC_RESERVE area for the remapped
lmem_maps from the other nodes).
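
To make (1) concrete, here is a rough userspace sketch of what a layer of
indirection under a pfn_to_page-style macro could look like. It is not
code from any tree. Every name in it (page_sketch, section_mem_map,
section_add, section_remove, pfn_to_page_sketch) is made up for
illustration, and the 4KB page size, 2GB section size and 40-byte
struct page are assumptions picked so the numbers line up with the
"2GB block, ~20MB of struct page" figure quoted above. calloc() stands in
for carving the array out of the start of the hot-added block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names and sizes here are hypothetical, chosen for illustration. */
#define PAGE_SHIFT        12                  /* assumed 4KB pages */
#define SECTION_SHIFT     31                  /* assumed 2GB hotplug block */
#define PFN_SECTION_SHIFT (SECTION_SHIFT - PAGE_SHIFT)
#define PAGES_PER_SECTION (1UL << PFN_SECTION_SHIFT)
#define MAX_SECTIONS      32                  /* assumed 64GB physical limit */

/* Stand-in for struct page; 40 bytes is an assumption, not a kernel fact. */
struct page_sketch {
	unsigned char raw[40];
};

/* One mem_map pointer per section; NULL means the section is not present. */
static struct page_sketch *section_mem_map[MAX_SECTIONS];

/* Bytes of struct page array needed to describe one section. */
static size_t section_mem_map_bytes(void)
{
	return PAGES_PER_SECTION * sizeof(struct page_sketch);
}

/* Hot-add a section: allocate its private piece of mem_map. */
static int section_add(unsigned long section)
{
	assert(section < MAX_SECTIONS);
	if (section_mem_map[section])
		return -1;                    /* already present */
	section_mem_map[section] =
		calloc(PAGES_PER_SECTION, sizeof(struct page_sketch));
	return section_mem_map[section] ? 0 : -1;
}

/* Hot-remove a section: drop its piece of mem_map. */
static void section_remove(unsigned long section)
{
	assert(section < MAX_SECTIONS);
	free(section_mem_map[section]);
	section_mem_map[section] = NULL;
}

/* The indirection: pick the section first, then index inside it. */
static struct page_sketch *pfn_to_page_sketch(unsigned long pfn)
{
	unsigned long section = pfn >> PFN_SECTION_SHIFT;
	struct page_sketch *map;

	if (section >= MAX_SECTIONS)
		return NULL;
	map = section_mem_map[section];
	if (!map)
		return NULL;                  /* hole or removed section */
	return map + (pfn & (PAGES_PER_SECTION - 1));
}
```

With these assumed sizes, one section has 524288 pages, so its piece of
mem_map comes to 524288 * 40 bytes = 20MB. The price of the indirection
is an extra load and a NULL check on every lookup. The sketch does
nothing about problem (2): on 32-bit each per-section array would still
have to be mapped somewhere in permanently mapped kernel space.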

>> You could just lock the pages, I'd think? I don't see at a glance
>> exactly what you were using this for, but would that work?
>
> I haven't seriously considered implementing vmalloc'd memory, but I
> guess that would be too complicated if not impossible.
> Making kernel threads or interrupt handlers block on memory access
> sounds very difficult to me.

Aahh, maybe I understand now. You're saying you don't support hotplugging
ZONE_NORMAL, so you want to restrict vmalloc accesses to the non-hotplugged
areas? In which case things like HIGHPTE will be a nightmare as well ... ;-)
You also need to be very wary of where memlocked pages are allocated from.

M.
