Re: [PATCH 0/9] Avoid populating unbounded num of ptes with mmap_sem held

From: Andy Lutomirski
Date: Fri Dec 21 2012 - 19:36:55 EST

On Thu, Dec 20, 2012 at 4:49 PM, Michel Lespinasse <walken@xxxxxxxxxx> wrote:
> We have many vma manipulation functions that are fast in the typical case,
> but can optionally be instructed to populate an unbounded number of ptes
> within the region they work on:
> - mmap with MAP_POPULATE or MAP_LOCKED flags;
> - remap_file_pages() with MAP_NONBLOCK not set or when working on a
>   VM_LOCKED vma;
> - mmap_region() and all its wrappers when mlock(MCL_FUTURE) is in effect;
> - brk() when mlock(MCL_FUTURE) is in effect.

Something's buggy here. My evil test case is stuck with lots of
threads spinning at 100% system time. Stack traces look like:

[<0000000000000000>] __mlock_vma_pages_range+0x66/0x70
[<0000000000000000>] __mm_populate+0xf9/0x150
[<0000000000000000>] vm_mmap_pgoff+0x9f/0xc0
[<0000000000000000>] sys_mmap_pgoff+0x7e/0x150
[<0000000000000000>] sys_mmap+0x22/0x30
[<0000000000000000>] system_call_fastpath+0x16/0x1b
[<0000000000000000>] 0xffffffffffffffff

perf top says:

38.45% [kernel] [k] __mlock_vma_pages_range
33.04% [kernel] [k] __get_user_pages
28.18% [kernel] [k] __mm_populate

The tasks in question use mlockall(MCL_FUTURE) but do not pass
MAP_POPULATE to mmap. The spinning tasks cannot be killed, even with
SIGKILL.
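
For reference, here is a minimal sketch of the flag combination described
above: mlockall(MCL_FUTURE) followed by plain mmap() calls without
MAP_POPULATE. It is not the actual test case, which is not shown in this
message. The helper names (enable_mcl_future, map_unmap_loop), the anonymous
private mapping, and the single-threaded loop are all assumptions made for
illustration, and there is no claim that this alone reproduces the hang.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: ask the kernel to lock every mapping created from
 * now on. Returns 0 on success, -1 on failure (for example when
 * RLIMIT_MEMLOCK or missing privileges forbid it).
 */
static int enable_mcl_future(void)
{
    if (mlockall(MCL_FUTURE) != 0) {
        fprintf(stderr, "mlockall(MCL_FUTURE): %s\n", strerror(errno));
        return -1;
    }
    return 0;
}

/*
 * Hypothetical helper: map and unmap `len` bytes of anonymous memory
 * `iters` times, deliberately without MAP_POPULATE. With MCL_FUTURE in
 * effect, each mmap() still has to populate and lock the new range.
 * Returns the number of mappings that succeeded.
 */
static int map_unmap_loop(size_t len, int iters)
{
    int ok = 0;

    assert(len > 0);
    for (int i = 0; i < iters; i++) {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            continue;   /* e.g. EAGAIN when the lock limit is exceeded */
        ok++;
        munmap(p, len);
    }
    return ok;
}
```

A caller would invoke enable_mcl_future() once and then run map_unmap_loop()
from each of several threads. The thread count, mapping size, and iteration
count are left to the caller, since the message does not give the values used
in the original test.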

