Re: [PATCH] mremap: enforce rmap src/dst vma ordering in case of vma_merge succeeding in copy_vma

From: Nai Xia
Date: Fri Nov 04 2011 - 22:21:35 EST


On Fri, Nov 4, 2011 at 11:59 PM, Pawel Sikora <pluto@xxxxxxxx> wrote:
> On Friday 04 of November 2011 22:34:54 Nai Xia wrote:
>> On Fri, Nov 4, 2011 at 3:31 PM, Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>> > On Mon, 31 Oct 2011, Andrea Arcangeli wrote:
>> >
>> >> migrate was doing a rmap_walk with speculative lock-less access on
>> >> pagetables. That could lead it to not serialize properly against
>> >> mremap PT locks. But a second problem remains in the order of vmas in
>> >> the same_anon_vma list used by the rmap_walk.
>> >
>> > I do think that Nai Xia deserves special credit for thinking deeper
>> > into this than the rest of us (before you came back): something like
>> >
>> > Issue-conceived-by: Nai Xia <nai.xia@xxxxxxxxx>
>>
>> Thanks! ;-)
>
> hi all,
>
> I'm still testing the anon_vma_order_tail() patch. 10 days of heavy processing
> and the machine is still stable, but I've recorded something interesting:
>
> $ uname -a
> Linux hal 3.0.8-vs2.3.1-dirty #6 SMP Tue Oct 25 10:07:50 CEST 2011 x86_64 AMD_Opteron(tm)_Processor_6128 PLD Linux
> $ uptime
>  16:47:44 up 10 days,  4:21,  5 users,  load average: 19.55, 19.15, 18.76
> $ ps aux|grep migration
> root         6  0.0  0.0      0     0 ?        S    Oct25     0:00 [migration/0]
> root         8 68.0  0.0      0     0 ?        S    Oct25  9974:01 [migration/1]
> root        13 35.4  0.0      0     0 ?        S    Oct25  5202:15 [migration/2]
> root        17 71.4  0.0      0     0 ?        S    Oct25 10479:10 [migration/3]
> root        21 70.7  0.0      0     0 ?        S    Oct25 10370:14 [migration/4]
> root        25 66.1  0.0      0     0 ?        S    Oct25  9698:11 [migration/5]
> root        29 70.1  0.0      0     0 ?        S    Oct25 10283:22 [migration/6]
> root        33 62.6  0.0      0     0 ?        S    Oct25  9190:28 [migration/7]
> root        37  0.0  0.0      0     0 ?        S    Oct25     0:00 [migration/8]
> root        41 97.7  0.0      0     0 ?        S    Oct25 14338:30 [migration/9]
> root        45 29.2  0.0      0     0 ?        S    Oct25  4290:00 [migration/10]
> root        49 68.7  0.0      0     0 ?        S    Oct25 10081:38 [migration/11]
> root        53 98.7  0.0      0     0 ?        S    Oct25 14477:25 [migration/12]
> root        57 70.0  0.0      0     0 ?        S    Oct25 10272:57 [migration/13]
> root        61 69.7  0.0      0     0 ?        S    Oct25 10232:29 [migration/14]
> root        65 70.9  0.0      0     0 ?        S    Oct25 10403:09 [migration/15]
>
> Wow, 71..241 hours in migration threads after 10 days of uptime?
> The machine has 2 Opteron nodes, with 32GB of RAM paired with each processor.
> I suppose it spends a lot of time on migration (processes + memory pages).

Hi Paweł, this looks to me like a load-balancing issue, but it might not be
directly related to this bug, or even to abnormal page migration at all.
Could this be a scheduler & interrupts issue?

But, oh well, I have never actually used a 16-core machine for heavy
processing, so I cannot tell whether this result is normal or not.

Maybe you should ask a broader range of people?

BR,
Nai

>
> BR,
> Paweł.
>