Re: [PATCH v4 00/22] x86, ACPI, numa: Parse numa info early

From: Tang Chen
Date: Sun May 12 2013 - 22:56:40 EST


Hi Yinghai,

On 05/10/2013 02:24 AM, Yinghai Lu wrote:
So I suggest to separate the job into 2 parts:
1. Push Yinghai's patch1 ~ patch20, without putting pagetable in local
node.
And push my work to use SRAT to arrange ZONE_MOVABLE.
In this case, we can enable memory hotplug in the kernel first.
2. Merge patch21 and patch22 into the fixing work I am doing now, and push
them together when finished.


no, no, no, please do not half-done work.

Do it right, and Do it clean.


I'm not saying I want to do it half-way. Putting the pagetable in the local
node will prevent the memory hot-remove patches from working.

Before removing memory, the kernel first offlines its pages. If the offline
step fails, hot-remove cannot proceed. Since your patches put each node's
pagetable in that node's own memory at boot time, that memory cannot be
offlined, and therefore it cannot be hot-removed.

The minimum unit of memory online/offline is a block. By default, one
block contains one section, which by default is 128MB. So if part of a
block holds pagetables and the rest is movable memory, the block cannot
be offlined, and as a result it cannot be removed.
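
To make the constraint concrete, here is a toy userspace model in C. It is
only a sketch, not kernel code: all the toy_* names are made up for
illustration, and it assumes 4KB pages and the default 128MB block, which
gives 32768 pages per block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, not kernel code. All toy_* names are hypothetical. */
#define TOY_PAGE_SHIFT      12   /* assumed 4KB pages */
#define TOY_SECTION_SHIFT   27   /* 128MB section, one section per block */
#define TOY_PAGES_PER_BLOCK (1UL << (TOY_SECTION_SHIFT - TOY_PAGE_SHIFT))

enum toy_page_use {
    TOY_FREE,       /* unused, nothing to do at offline time */
    TOY_MOVABLE,    /* user page, can be migrated to another block */
    TOY_PAGETABLE,  /* kernel pagetable, cannot be migrated */
};

struct toy_block {
    enum toy_page_use use[TOY_PAGES_PER_BLOCK];
};

/* Mark one page of the block as being used in a particular way. */
static void toy_set_use(struct toy_block *b, size_t pfn, enum toy_page_use u)
{
    assert(pfn < TOY_PAGES_PER_BLOCK);
    b->use[pfn] = u;
}

/*
 * A block can go offline only if every page in it is free or can be
 * migrated away. A single pagetable page pins the whole 128MB block.
 */
static bool toy_block_can_offline(const struct toy_block *b)
{
    for (size_t i = 0; i < TOY_PAGES_PER_BLOCK; i++)
        if (b->use[i] == TOY_PAGETABLE)
            return false;
    return true;
}
```

For example, a zero-initialized struct toy_block can be offlined, and it
still can after toy_set_use(&b, 10, TOY_MOVABLE), but not after
toy_set_use(&b, 11, TOY_PAGETABLE).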

To fix this, I can see four possible solutions:

1. Reserve the whole block (128MB) so that no user can use the rest of
the block, and skip it when offlining memory.
When all the other blocks are offlined, free the pagetable and remove
all the memory.

But we lose some memory this way, and 128MB is a lot to waste.


2. Migrate movable pages and keep this block online. Although the offline
operation fails, go ahead and remove the memory anyway.

But the offline operation will then always fail. And generally speaking,
offline can fail for many reasons, so it is difficult to tell whether
it is really OK to remove the memory.


3. Migrate user pages and make this block offline, but the kernel can
still use the pagetable in it.

But this will change the semantics of "offline". I'm not sure if we
can do it in this way.


4. Do not allocate the pagetable on the local node when
CONFIG_MEMORY_HOTREMOVE is enabled. (I do suggest not putting the
pagetable in the local node in the memory hot-remove case.)
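
Solution 4 amounts to a policy choice at allocation time. A minimal sketch,
assuming a hypothetical helper toy_pagetable_node() (not an existing kernel
function) and assuming there is some fallback node that is never
hot-removed. In the kernel the flag would come from
CONFIG_MEMORY_HOTREMOVE rather than from a runtime argument:

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not an existing kernel function.
 * local_node:     the node whose memory the pagetable will map
 * fallback_node:  assumed to be a node that is never hot-removed
 * hotremove:      stands in for CONFIG_MEMORY_HOTREMOVE being enabled
 */
static int toy_pagetable_node(int local_node, int fallback_node, bool hotremove)
{
    /* With hot-remove enabled, keep pagetables off the local node. */
    return hotremove ? fallback_node : local_node;
}
```

With hot-remove enabled, the pagetable for node 2 would land on the
fallback node; with it disabled, it stays on node 2 as in your patches.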


What do you think about these 4 solutions?

I think I need some advice from the community on this problem. Do you have
any idea how to fix it if we put the pagetable in the local node?

The memory hot-plug guys do want to use memory hot-remove. So I think
for now we should use solution 4 above: when CONFIG_MEMORY_HOTREMOVE is
enabled, do not allocate the pagetable on the local node.

I'm not trying to do it half-way. Once we fix this problem, we can allocate
the pagetable on the local node again with CONFIG_MEMORY_HOTREMOVE enabled.

Please do give some advice or feedback.



If you have any thinking of this patch-set, please let me know.

Talked to HPA, and he will put my patchset into tip/x86/mm after v3.10-rc1.

after that we can work on put pagetable on local node for hotadd path.


hot-add path is another problem. But I think the hot-remove path is more
urgent now.


Thanks. :)