Re: kswapd craziness in 3.7

From: Mel Gorman
Date: Wed Nov 28 2012 - 18:54:08 EST


On Wed, Nov 28, 2012 at 02:52:15PM -0800, Andrew Morton wrote:
> On Wed, 28 Nov 2012 10:13:59 +0000
> Mel Gorman <mgorman@xxxxxxx> wrote:
>
> > Based on the reports I've seen I expect the following to work for 3.7
> >
> > Keep
> > 96710098 mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
> > ef6c5be6 fix incorrect NR_FREE_PAGES accounting (appears like memory leak)
> >
> > Revert
> > 82b212f4 Revert "mm: remove __GFP_NO_KSWAPD"
> >
> > Merge
> > mm: vmscan: fix kswapd endless loop on higher order allocation
> > mm: Avoid waking kswapd for THP allocations when compaction is deferred or contended
>
> "mm: Avoid waking kswapd for THP ..." is marked "I have not tested it
> myself" and when Zdenek tested it he hit an unexplained oom.
>

I thought Zdenek was testing with __GFP_NO_KSWAPD when he hit that OOM.
Further, when he hit that OOM, it looked like a genuine OOM. He had no
swap configured and inactive/active file pages were very low. Finally,
the free pages for Normal looked off and could also have been affected by
the accounting bug. I'm looking at https://lkml.org/lkml/2012/11/18/132
here. Are you thinking of something else?

I have not tested with the patch admittedly but Thorsten has and seemed
to be ok with it https://lkml.org/lkml/2012/11/23/276.

> > Johannes' patch should remove the necessity for __GFP_NO_KSWAPD revert but I
> > think we should also avoid waking kswapd for THP allocations if compaction
> > is deferred. Johannes' patch might mean that kswapd goes quickly go back
> > to sleep but it's still busy work.
> >
> > 3.6 is still known to be screwed in terms of THP because of the amount of
> > time it can spend in compaction after lumpy reclaim was removed. This is
> > my old list of patches I felt needed to be backported after 3.7 came out.
> > They are not tagged -stable, I'll be sending it to Greg manually.
> >
> > e64c523 mm: compaction: abort compaction loop if lock is contended or run too long
> > 3cc668f mm: compaction: move fatal signal check out of compact_checklock_irqsave
> > 661c4cb mm: compaction: Update try_to_compact_pages()kerneldoc comment
> > 2a1402a mm: compaction: acquire the zone->lru_lock as late as possible
> > f40d1e4 mm: compaction: acquire the zone->lock as late as possible
> > 753341a revert "mm: have order > 0 compaction start off where it left"
> > bb13ffe mm: compaction: cache if a pageblock was scanned and no pages were isolated
> > c89511a mm: compaction: Restart compaction from near where it left off
> > 6299702 mm: compaction: clear PG_migrate_skip based on compaction and reclaim activity
> > 0db63d7 mm: compaction: correct the nr_strict va isolated check for CMA
> >
> > Only Johannes' patch needs to be added to this list. kswapd is not woken
> > for THP in 3.6 but as it calls compaction for other high-order allocations
> > it still makes sense.
>
> Please identify "Johannes' patch"?

mm: vmscan: fix kswapd endless loop on higher order allocation

--
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/