Re: [PATCH 1/9] mm: fix 100% CPU kswapd busyloop on unreclaimable nodes

From: Johannes Weiner
Date: Tue Mar 07 2017 - 13:55:08 EST


On Tue, Mar 07, 2017 at 11:17:02AM +0100, Michal Hocko wrote:
> On Mon 06-03-17 11:24:10, Johannes Weiner wrote:
> > @@ -3271,7 +3271,8 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int classzone_idx)
> >  		 * Raise priority if scanning rate is too low or there was no
> >  		 * progress in reclaiming pages
> >  		 */
> > -		if (raise_priority || !sc.nr_reclaimed)
> > +		nr_reclaimed = sc.nr_reclaimed - nr_reclaimed;
> > +		if (raise_priority || !nr_reclaimed)
> >  			sc.priority--;
> >  	} while (sc.priority >= 1);
> >
>
> I would rather not play with the sc state here. From a quick look at
> least
> 	/*
> 	 * Fragmentation may mean that the system cannot be rebalanced for
> 	 * high-order allocations. If twice the allocation size has been
> 	 * reclaimed then recheck watermarks only at order-0 to prevent
> 	 * excessive reclaim. Assume that a process requested a high-order
> 	 * can direct reclaim/compact.
> 	 */
> 	if (sc->order && sc->nr_reclaimed >= compact_gap(sc->order))
> 		sc->order = 0;
>
> does rely on the value. Wouldn't something like the following be safer?

Well, what behavior is correct, though? This check looks like an
argument *against* resetting sc.nr_reclaimed.

If kswapd is woken up for a higher order, this check sets a reclaim
cutoff beyond which it should give up on the order and balance for 0.

That cutoff is scoped to the whole kswapd invocation. Applying the
threshold to the outcome of just the preceding priority level seems
like a mistake.
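
To make the difference concrete, here is a toy userspace sketch, not
kernel code. The names toy_compact_gap() and toy_cutoff_fires() are
hypothetical, and the gap is assumed to be "twice the allocation size"
as the quoted comment describes. It compares the cutoff check against
the running total for the invocation with the same check against only
the most recent priority round:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for compact_gap(): twice the allocation size. */
static unsigned long toy_compact_gap(unsigned int order)
{
	return 2UL << order;
}

/*
 * Simulate one kswapd invocation. per_round[i] is the number of pages
 * reclaimed during priority round i. Returns true if the "give up on
 * the order" cutoff would fire at any point.
 *
 * cumulative == true:  compare the running total (counter never reset).
 * cumulative == false: compare only the latest round (counter reset
 *                      before each priority level).
 */
static bool toy_cutoff_fires(const unsigned long *per_round, size_t rounds,
			     unsigned int order, bool cumulative)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < rounds; i++) {
		unsigned long seen;

		total += per_round[i];
		seen = cumulative ? total : per_round[i];
		if (order && seen >= toy_compact_gap(order))
			return true;
	}
	return false;
}
```

With an order-3 request and three rounds that each reclaim 6 pages, the
cumulative variant crosses the threshold in the third round, while the
per-round variant never does, so reclaim would keep targeting the high
order.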

Mel? Vlastimil?