Re: khugepaged / firefox going wild in 3.18-rc

From: Vlastimil Babka
Date: Thu Nov 06 2014 - 07:25:54 EST


On 11/05/2014 01:20 AM, David Rientjes wrote:
> On Wed, 5 Nov 2014, Norbert Preining wrote:
>
>> Hi David,
>>
>> one more thing, attached dmesg output with some page faults,
>> maybe this is connected?
>>
>
> Hmm, I'm not aware of any mm->mmap_sem starvation issues in 3.18-rc, maybe
> this is a duplicate of another issue that someone has reported that I
> haven't seen. The lengthy output of echo t > /proc/sysrq-trigger should
> give a clue as to what is holding it or perhaps this is a more generic
> rwsem issue.

It could be that another task holds the mmap_sem during a THP allocation attempt
on its own page fault, and compaction gets stuck in some kind of infinite loop.
There are two other threads that look similar:

http://article.gmane.org/gmane.linux.kernel.mm/124451/match=isolate_freepages_block+very+high+intermittent+overhead

https://lkml.org/lkml/2014/11/4/144

I suggested testing a commit revert in one thread, and a possible fix in the
other. If you can reproduce this reliably, testing either of them would be very
useful.

khugepaged using CPU also points to either the address space scanning or
compaction going wrong. Since commit 8b1645685ac khugepaged shouldn't hold
mmap_sem during compaction, but page faulters can still hold it.

So yeah, we would need the stacks of the processes that hog the CPUs, not those
that sleep. As David suggested, /proc/<pid>/stack could work. Also, can you
please provide /proc/zoneinfo?
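
In case it helps with collecting that, here is a minimal sketch, assuming
Python 3 is available on the affected machine. It lists the tasks that are
currently in state R (running or runnable), dumps /proc/<pid>/stack for each,
and appends /proc/zoneinfo. The function names and the proc_root parameter are
made up for illustration, and treating "state R" as "hogging the CPU" is only
an approximation, so take it as a starting point rather than a finished tool.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat line into (comm, state, cpu_ticks)."""
    # comm is parenthesised and may itself contain spaces or ')',
    # so locate it via the first '(' and the last ')'.
    lpar = text.index("(")
    rpar = text.rindex(")")
    comm = text[lpar + 1:rpar]
    rest = text[rpar + 1:].split()
    state = rest[0]
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    ticks = int(rest[11]) + int(rest[12])
    return comm, state, ticks


def running_tasks(proc_root="/proc"):
    """Return (pid, comm, cpu_ticks) for tasks in state R, busiest first."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []
    tasks = []
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, state, ticks = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited meanwhile, or stat was unreadable
        if state == "R":
            tasks.append((int(entry), comm, ticks))
    return sorted(tasks, key=lambda t: t[2], reverse=True)


def read_or_note(path):
    """Read a file, or return a placeholder if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError as e:
        return "<unreadable: %s>\n" % e


def collect(proc_root="/proc"):
    """Build one text report: stacks of running tasks, then zoneinfo."""
    out = []
    for pid, comm, ticks in running_tasks(proc_root):
        out.append("== %d (%s) cpu_ticks=%d ==\n" % (pid, comm, ticks))
        out.append(read_or_note(os.path.join(proc_root, str(pid), "stack")))
    out.append("== zoneinfo ==\n")
    out.append(read_or_note(os.path.join(proc_root, "zoneinfo")))
    return "".join(out)


if __name__ == "__main__":
    print(collect())
```

Reading /proc/<pid>/stack typically needs root. The sketch only looks at the
top-level /proc/<pid> entries, so individual firefox threads would need the
same treatment under /proc/<pid>/task/<tid>/. Running it a few times while the
problem is happening gives a better picture than a single snapshot.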

Thanks,
Vlastimil
