Re: [PATCH 12/27] mm, vmscan: Make shrink_node decisions more node-centric

From: Mel Gorman
Date: Thu Jun 16 2016 - 10:48:06 EST


On Thu, Jun 16, 2016 at 03:35:15PM +0200, Vlastimil Babka wrote:
> On 06/09/2016 08:04 PM, Mel Gorman wrote:
> >Earlier patches focused on having direct reclaim and kswapd use data that
> >is node-centric for reclaiming but shrink_node() itself still uses too much
> >zone information. This patch removes unnecessary zone-based information
> >with the most important decision being whether to continue reclaim or
> >not. Some memcg APIs are adjusted as a result even though memcg itself
> >still uses some zone information.
> >
> >Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>
> [...]
>
> >@@ -2372,21 +2374,27 @@ static inline bool should_continue_reclaim(struct zone *zone,
> > * inactive lists are large enough, continue reclaiming
> > */
> > pages_for_compaction = (2UL << sc->order);
> >- inactive_lru_pages = node_page_state(zone->zone_pgdat, NR_INACTIVE_FILE);
> >+ inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
> > if (get_nr_swap_pages() > 0)
> >- inactive_lru_pages += node_page_state(zone->zone_pgdat, NR_INACTIVE_ANON);
> >+ inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);
> > if (sc->nr_reclaimed < pages_for_compaction &&
> > inactive_lru_pages > pages_for_compaction)
> > return true;
> >
> > /* If compaction would go ahead or the allocation would succeed, stop */
> >- switch (compaction_suitable(zone, sc->order, 0, 0)) {
> >- case COMPACT_PARTIAL:
> >- case COMPACT_CONTINUE:
> >- return false;
> >- default:
> >- return true;
> >+ for (z = 0; z <= sc->reclaim_idx; z++) {
> >+ struct zone *zone = &pgdat->node_zones[z];
> >+
> >+ switch (compaction_suitable(zone, sc->order, 0, 0)) {
>
> Using 0 for classzone_idx here was sort of OK when each zone was reclaimed
> separately, as a Normal allocation not passing appropriate classzone_idx
> (and thus subtracting lowmem reserve from free pages) means that a false
> COMPACT_PARTIAL (or COMPACT_CONTINUE) could be returned for e.g. DMA zone.
> It means a premature end of reclaim for this single zone, which is
> relatively small anyway, so no big deal (and we might avoid useless
> over-reclaim, when even reclaiming everything wouldn't get us above the
> lowmem_reserve).
>
> But in node-centric reclaim, such premature "return false" from a DMA zone
> stops reclaiming the whole node. So I think we should involve the real
> classzone_idx here.
>

Fair point although for compaction, it'll occur for a marginal corner
case. Premature allowed compaction for ZONE_DMA is unfortunate but
bizarre to think there would be a high-order allocation restricted to
just that zone too.

I'll pass in sc->reclaim_idx as it represents the allocating order. That
highlights that direct reclaim was not setting reclaim_idx but that's
been corrected.

--
Mel Gorman
SUSE Labs