Re: mmap() scalability in the presence of the MAP_POPULATE flag

From: Michel Lespinasse
Date: Sat Jan 05 2013 - 02:42:56 EST


On Fri, Jan 4, 2013 at 10:40 PM, Roman Dubtsov <dubtsov@xxxxxxxxx> wrote:
> On Fri, 2013-01-04 at 03:57 -0800, Michel Lespinasse wrote:
>> If this doesn't help, could you please send me your test case ? I
>> think you described enough of it that I would be able to reproduce it
>> given some time, but it's just easier if you send me a short C file :)
>
> It does not, the results are more or less the same. I've attached my
> testcase. It does map anonymous memory. It also uses OpenMP for
> threading because I'm lazy, so it requires passing -fopenmp to gcc and
> the number of threads it runs is defined via OMP_NUM_THREADS environment
> variable. There are also two macros that influence test's behavior:
>
> - POPULATE_VIA_LOOP -- makes the test populate memory using a loop
> - POPULATE_VIA_MMAP -- makes the test populate memory via MAP_POPULATE
>
> If none of the macros are defined, the test does not populate memory.

Heh, very interesting. As it turns out, the problem gets MUCH worse as
the number of threads increases.
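
For readers without the attachment, here is a minimal sketch of the kind
of test described in the quoted message. It is not Roman's actual file:
the names map_and_populate(), run_test() and enum populate_mode are
hypothetical, and the population mode is picked at run time rather than
through the POPULATE_VIA_LOOP / POPULATE_VIA_MMAP macros.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS and MAP_POPULATE */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical run-time stand-in for the two macros in the original test. */
enum populate_mode { POPULATE_NONE, POPULATE_VIA_LOOP, POPULATE_VIA_MMAP };

/*
 * Map len bytes of anonymous memory, populate it according to mode,
 * then unmap it. Returns 0 on success, -1 on failure.
 */
static int map_and_populate(size_t len, enum populate_mode mode)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    long page = sysconf(_SC_PAGESIZE);
    char *p;

    assert(page > 0);

    if (mode == POPULATE_VIA_MMAP)
        flags |= MAP_POPULATE;  /* ask the kernel to prefault the range */

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if (mode == POPULATE_VIA_LOOP) {
        /* Touch one byte per base page so every page gets faulted in. */
        for (size_t off = 0; off < len; off += (size_t)page)
            p[off] = 1;
    }

    return munmap(p, len);
}

/*
 * Run iters map/populate/unmap cycles. With -fopenmp the iterations are
 * spread over OMP_NUM_THREADS threads; without it the pragma is ignored
 * and the loop runs serially. Returns the number of failed cycles.
 */
static int run_test(size_t len, int iters, enum populate_mode mode)
{
    int failures = 0;

    #pragma omp parallel for reduction(+:failures)
    for (int i = 0; i < iters; i++)
        if (map_and_populate(len, mode) != 0)
            failures++;

    return failures;
}
```

Built with gcc -fopenmp, something like run_test(1UL << 30, 64, mode)
could be timed for each mode under different OMP_NUM_THREADS values. The
mapping size and iteration count here are placeholders, not the values
from the original test.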

We are populating the anon mapping with huge pages. In the
POPULATE_VIA_LOOP case, we are just taking a page fault every 2MB and
filling it up with a zeroed huge page - most of the runtime comes from
clearing the huge page.

In the POPULATE_VIA_MMAP case, follow_page() is called at 4KB address
increments, and it takes the mm->page_table_lock 511 times out of 512
(that is, every time it falls within a huge page that's just been
populated). So all OMP_NUM_THREADS threads are constantly bouncing
over the mm->page_table_lock, and getting terrible performance as a
result.
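
To put rough numbers on the two cases, the counts above can be worked
out as below. This is only a back-of-the-envelope model of what the
text describes, assuming 4KB base pages, 2MB huge pages, and a mapping
that is a multiple of 2MB and fully backed by huge pages. The struct
and function names are made up for illustration.

```c
#include <assert.h>

#define BASE_PAGE_SIZE (4UL << 10)   /* 4KB */
#define HUGE_PAGE_SIZE (2UL << 20)   /* 2MB */

struct populate_cost {
    unsigned long page_faults;        /* POPULATE_VIA_LOOP: one per huge page */
    unsigned long follow_page_calls;  /* POPULATE_VIA_MMAP: one per 4KB step */
    unsigned long lock_acquisitions;  /* calls landing in an already-populated huge page */
};

/* Estimate per-mapping counts for a mapping of len bytes. */
static struct populate_cost estimate_cost(unsigned long len)
{
    struct populate_cost c;
    unsigned long huge_pages;

    /* The model only holds for whole huge pages. */
    assert(len % HUGE_PAGE_SIZE == 0);

    huge_pages = len / HUGE_PAGE_SIZE;
    c.page_faults = huge_pages;
    c.follow_page_calls = len / BASE_PAGE_SIZE;
    /* 511 out of every 512 calls hit a huge page that was just populated. */
    c.lock_acquisitions = c.follow_page_calls - huge_pages;
    return c;
}
```

For a 1GB mapping this gives 512 page faults in the loop case, versus
262144 follow_page() calls and 261632 page_table_lock acquisitions in
the MAP_POPULATE case, per thread and per mapping.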

Thanks for the report. I don't have a patch just now, but this does
seem very solvable.

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.