Re: [patch for-3.17] mm, thp: fix collapsing of hugepages on madvise

From: Kirill A. Shutemov
Date: Mon Oct 06 2014 - 11:04:14 EST


On Sat, Oct 04, 2014 at 07:48:04PM -0700, David Rientjes wrote:
> If an anonymous mapping is not allowed to fault thp memory and then
> madvise(MADV_HUGEPAGE) is used after fault, khugepaged will never
> collapse this memory into thp memory.
>
> This occurs because the madvise(2) handler for thp, hugepage_advise(),
> clears VM_NOHUGEPAGE on the stack and it isn't stored in vma->vm_flags
> until the final action of madvise_behavior(). This causes the
> khugepaged_enter_vma_merge() to be a no-op in hugepage_advise() when the
> vma had previously had VM_NOHUGEPAGE set.
>
> Fix this by passing the correct vma flags to the khugepaged mm slot
> handler. There's no chance khugepaged can run on this vma until after
> madvise_behavior() returns since we hold mm->mmap_sem.
>
> It would be possible to clear VM_NOHUGEPAGE directly from vma->vm_flags
> in hugepage_advise(), but I didn't want to introduce special case
> behavior into madvise_behavior(). I think it's best to just let it
> always set vma->vm_flags itself.
>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Reported-by: Suleiman Souhlal <suleiman@xxxxxxxxxx>
> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>

Okay, I've looked once again and it seems your approach is better.
Although, I don't like that we need to pass down vma->vm_flags to every
khugepaged_enter() and khugepaged_enter_vma_merge().

My proposal is below. Build-tested only.

And I don't think this is subject for stable@: no crash or serious
misbehaviour. Registering to khugepaged is postponed until first page
fault. Not a big deal.