Re: khugepaged / firefox going wild in 3.18-rc

From: Vlastimil Babka
Date: Thu Nov 06 2014 - 08:03:44 EST


On 11/06/2014 01:39 PM, Norbert Preining wrote:
> Hi Vlastimil
>
> thanks for your answer.
>
> In the meantime I have tried rc3, too, with the same effects.
>
> Interestingly, once it goes into a bad state, every future approach
> does the same. I started shotwell (photo organizer) and it went into the
> same state (khugepaged / shotwell each using about 100% of CPU time).
>
> On Thu, 06 Nov 2014, Vlastimil Babka wrote:
>> Could be that another task holds the mmap_sem during THP allocation attempt on
>> its own page fault, and compaction goes in some kind of infinite loop. There are
>
> My feeling somehow is about the plugin-container in firefox ...
> (flashplayer or something similar, but I might be wrong!). With shotwell,
> I have no idea why.

plugin-container is a different process from firefox, so it should show its
CPU consumption separately. If you see firefox, it's the firefox binary itself.

>> I suggested testing a commit revert in one thread, and a possible fix in the
>> other. If you can reproduce this well, that would be very useful.
>
> Which commit are you talking about? I can easily revert some/all of what you
> want and do test runs.

OK, one possibility is to run this (the revert should apply cleanly):
git revert e14c720efdd73c6d69cd8d07fa894bcd11fe1973

Then there's a patch at the end of this e-mail, which however I doubt would fix
the symptoms you describe.


>> khugepaged using CPU also points to either the address space scanning, or
>> compaction going wrong. Since 8b1645685ac it shouldn't hold mmap_sem during
>> compaction, but that still leaves page faulters to possibly hold it.
>
> So, do you mean I should try reverting 8b1645685ac?

No, not that one. That one should actually reduce the problem you see.

>> So yeah, we would need the stacks of processes that do hog the CPUs, not those
>> that sleep. As David suggested, a /proc/pid/stack could work. Also can you
>> please provide /proc/zoneinfo ?
>
> Again, as I mentioned, I don't have /proc/pid/stack for any "pid"; does
> this depend on some specific kernel option?

Ah, I missed that. It should be CONFIG_STACKTRACE that enables it.
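
A minimal sketch of how one might check for that option and then dump the
kernel stacks of the CPU-hogging tasks. It is not part of any kernel tooling:
the helper names are hypothetical, the config file locations (/proc/config.gz,
/boot/config-<release>) are assumptions about the installation, and reading
/proc/<pid>/stack usually needs root.

```python
import gzip
import os
import platform


def stacktrace_enabled(config_text):
    """Return True if the kernel config text sets CONFIG_STACKTRACE=y."""
    return any(line.strip() == "CONFIG_STACKTRACE=y"
               for line in config_text.splitlines())


def load_kernel_config():
    """Return the running kernel's config text, or None if it is not found.

    Assumed locations: /proc/config.gz (needs CONFIG_IKCONFIG_PROC) and
    /boot/config-<release> (common on distribution kernels).
    """
    if os.path.exists("/proc/config.gz"):
        with gzip.open("/proc/config.gz", "rt") as f:
            return f.read()
    path = "/boot/config-" + platform.release()
    if os.path.exists(path):
        with open(path) as f:
            return f.read()
    return None


def read_stack(pid, proc_root="/proc"):
    """Return the kernel stack of `pid` as text, or None if unavailable."""
    try:
        with open(os.path.join(proc_root, str(pid), "stack")) as f:
            return f.read()
    except OSError:
        # File missing (no CONFIG_STACKTRACE), task gone, or no permission.
        return None


def report(pids):
    """Print the CONFIG_STACKTRACE status and the stack of each given pid."""
    config = load_kernel_config()
    if config is None:
        print("kernel config not found")
    else:
        print("CONFIG_STACKTRACE enabled:", stacktrace_enabled(config))
    for pid in pids:
        print("== pid", pid)
        print(read_stack(pid) or "(no stack available)")
```

Calling report() with the pids of khugepaged and the spinning application,
a few times in a row while they are busy, should show where they are stuck.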

> /proc/zoneinfo I have and can send you the next time.
>
> Thanks a lot and all the best

Great, thank you!
Vlastimil

> Norbert
>
> ------------------------------------------------------------------------
> PREINING, Norbert http://www.preining.info
> JAIST, Japan TeX Live & Debian Developer
> GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13
> ------------------------------------------------------------------------
>

------8<------