Re: [PATCH 1/1] numa, mm, memory-hotplug: Do not allocate pagetable to local node with MEMORY_HOTREMOVE enabled.

From: Pekka Enberg
Date: Tue May 21 2013 - 03:02:53 EST


On Thu, May 16, 2013 at 2:50 PM, Tang Chen <tangchen@xxxxxxxxxxxxxx> wrote:
> The following patch-set allocates pagetables on the local node.
> https://lkml.org/lkml/2013/4/11/829
>
> Doing this will break memory hot-remove.
>
> Before removing memory, the kernel offlines memory. If offlining
> memory fails, the memory cannot be removed. The pagetables are
> used by the kernel, so they cannot be offlined. Furthermore, they
> cannot be removed.
>
> Of course, we can free pagetable pages because the pagetables of
> the to-be-removed memory are useless. But offlining memory doesn't
> mean removing memory. If users only want to offline memory, the
> pagetables should not be freed.
>
> The minimum unit of memory online/offline is a block. And by default,
> one block contains one section, which by default is 128MB. It is
> possible that half of the block is pagetable, and the other half
> is movable memory.
>
> When we offline this kind of block, the status of the block is
> uncertain. We cannot simply free the pagetables in this block because
> they may be used by other online blocks. But when doing memory
> hot-remove, the failure of offlining blocks will break the memory
> hot-remove logic.
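
To make the failure mode above concrete, here is a toy model in plain C.
It is not kernel code: every name in it is made up for illustration, the
block is shrunk to 8 pages, and the only rule it encodes is the one
described above (a block holding a pagetable page cannot be offlined, and
hot-remove needs every block offlined first).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page classes; the real kernel tracks far more state. */
enum page_kind { PAGE_FREE, PAGE_MOVABLE, PAGE_PAGETABLE };

/* Toy block size. Assuming 4KB pages, a real 128MB block holds 32768. */
#define TOY_PAGES_PER_BLOCK 8

struct toy_block {
	enum page_kind pages[TOY_PAGES_PER_BLOCK];
};

/* A block can go offline only if no page in it is pinned by the kernel. */
static bool toy_block_can_offline(const struct toy_block *blk)
{
	assert(blk != NULL);
	for (size_t i = 0; i < TOY_PAGES_PER_BLOCK; i++) {
		if (blk->pages[i] == PAGE_PAGETABLE)
			return false;	/* in use by the kernel, not migratable */
	}
	return true;	/* only free or movable pages remain */
}

/* Hot-remove requires every block of the node to be offlined first. */
static bool toy_node_can_hot_remove(const struct toy_block *blks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!toy_block_can_offline(&blks[i]))
			return false;
	}
	return true;
}
```

In this model a single pagetable page in an otherwise movable block is
enough to make toy_node_can_hot_remove() return false for the whole node.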
>
>
> In order to fix it, we have three solutions:
>
> 1. Reserve the whole block (128MB), so that no user can use the rest
> of the block, and skip it when offlining memory.
> When all the other blocks are offlined, free the pagetable, and remove
> all the memory.
>
> But we may lose some memory for this purpose. 128MB is a little big
> to waste.
>
>
> 2. Keep this block online. Although the offline operation fails, it is
> OK to remove memory.
>
> But the offline operation will always fail. And generally speaking,
> offlining can fail for many reasons, so it is difficult to
> detect if it is OK to remove memory. So we don't suggest this way.
>
>
> 3. Migrate user pages and make this block offline. Offlining memory won't
> stop the kernel using the pagetables stored in them, so it will be OK.
>
> But this will change the semantics of "offline". I'm not sure if we
> can do it in this way.
>
>
> So before we fix this problem, I think we should not allocate pagetables
> to local node when CONFIG_MEMORY_HOTREMOVE is enabled. And restore it once
> we confirm the direction and fix the problem.
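
The shape of that stopgap can be sketched roughly as below. This is not
the patch itself, only a guess at the decision it describes. The function
name, the TOY_NO_NODE sentinel, and the runtime flag standing in for the
compile-time CONFIG_MEMORY_HOTREMOVE option are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel meaning "no node preference for this allocation". */
#define TOY_NO_NODE (-1)

/*
 * Pick the node to allocate pagetables from for memory on local_nid.
 * hotremove_enabled models CONFIG_MEMORY_HOTREMOVE as a runtime flag so
 * that both outcomes can be exercised in one build.
 */
static int toy_pagetable_node(int local_nid, bool hotremove_enabled)
{
	assert(local_nid >= 0);
	if (hotremove_enabled)
		return TOY_NO_NODE;	/* keep pagetables off the removable node */
	return local_nid;		/* local allocation, as in the earlier patch-set */
}
```

With the flag set, no pagetable lands on the node by preference, so none
of its blocks is pinned for that reason. The cost is that the locality
benefit of the earlier patch-set is lost.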
>
> This patch is based on
> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-mm
>
> Any other solution for this problem is welcome.
>
>
> Signed-off-by: Tang Chen <tangchen@xxxxxxxxxxxxxx>

Ugh. Special-casing for CONFIG_MEMORY_HOTPLUG is just begging for
trouble. Were you able to determine which commit broke memory
hot-remove?