Re: [PATCH for v5.9] mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs

From: Mel Gorman
Date: Thu Aug 27 2020 - 10:00:32 EST


On Wed, Aug 26, 2020 at 02:12:44PM +0900, Joonsoo Kim wrote:
> > > And, it requires to break current code
> > > layering that order-0 page is always handled by the pcplist. I'd prefer
> > > to avoid it so this patch uses different way to skip CMA page allocation
> > > from the pcplist.
> >
> > Well it would be much simpler and won't affect most of allocations. Better than
> > flushing pcplists IMHO.
>
> Hmm...Still, I'd prefer my approach.

I prefer the pcp bypass approach. It's simpler and it does not incur a
pcp drain/refill penalty.
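
To make the bypass idea concrete, here is a minimal user-space sketch of
the decision it implies. It is not the patch itself: every toy_* name and
TOY_* constant is a hypothetical stand-in, and the sketch assumes that CMA
pages can only sit on the movable pcp list.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel migrate types; values are illustrative. */
enum toy_migratetype {
	TOY_MIGRATE_UNMOVABLE,
	TOY_MIGRATE_MOVABLE,
	TOY_MIGRATE_RECLAIMABLE,
};

/* Hypothetical flag meaning "this request may be served from CMA". */
#define TOY_ALLOC_CMA 0x1u

/*
 * Decide whether a request may be served from the per-cpu list.
 * Assumption: only the movable pcp list can hold CMA pages, so it is the
 * only list a nocma request has to skip.
 */
static bool toy_can_use_pcplist(unsigned int order,
				enum toy_migratetype migratetype,
				unsigned int alloc_flags,
				bool cma_enabled)
{
	if (order != 0)
		return false;	/* pcp lists only serve order-0 in this model */
	if (!cma_enabled)
		return true;	/* no CMA pages exist, nothing to avoid */
	if (alloc_flags & TOY_ALLOC_CMA)
		return true;	/* caller accepts CMA pages */
	/* nocma request: bypass only the list that may contain CMA pages */
	return migratetype != TOY_MIGRATE_MOVABLE;
}
```

In this model only order-0 movable requests made in a nocma context fall
through to the slower path; everything else keeps using the pcp list, and
no drain or refill is needed.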

> There are two reasons. First, the layering problem mentioned above. In
> rmqueue(), there is code for MIGRATE_HIGHATOMIC. As the name shows, it's
> for high-order atomic allocation. But after skipping pcplist allocation
> as you suggested, we could get there with an order-0 request.

I guess your concern is that, under some circumstances, a request that
passes a watermark check could fail due to a highatomic reserve, and to
an extent this is true. However, in that case the system is already low
on memory and, depending on the allocation context, the pcp lists may get
flushed anyway.

> We can also change this code, but I'd hope to maintain the current
> layering. Second, a performance reason. After the flag for nocma is up,
> a burst of nocma allocations could come. After flushing the pcplist
> once, we can use the free pages on the pcplist as usual until the
> context is changed.

It's not guaranteed, because CMA pages could be freed between the nocma
save and restore, triggering further drains due to a reschedule. Similarly,
a CMA allocation in parallel could refill the per-cpu list with CMA pages.
While both cases are unlikely, it's more unpredictable than a
straightforward pcp bypass.
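
As a rough illustration of that unpredictability, the toy model below
counts how many drains a "drain when a CMA page is on the list" strategy
needs inside one nocma section. It is a user-space sketch, not kernel
code: the toy_* types, the event names and the starting state (one CMA
page already on the list) are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_PCP_MAX 8

/* Toy per-cpu list: we only track whether each cached page is CMA. */
struct toy_pcp {
	bool is_cma[TOY_PCP_MAX];
	size_t count;
};

/* Events seen while the nocma flag is set (hypothetical names). */
enum toy_event {
	TOY_EV_NOCMA_ALLOC,	/* order-0 allocation that must avoid CMA */
	TOY_EV_CMA_FREE,	/* a CMA page is freed back to the pcp list */
};

static void toy_pcp_free(struct toy_pcp *pcp, bool is_cma)
{
	if (pcp->count < TOY_PCP_MAX)
		pcp->is_cma[pcp->count++] = is_cma;
}

static bool toy_pcp_has_cma(const struct toy_pcp *pcp)
{
	for (size_t i = 0; i < pcp->count; i++)
		if (pcp->is_cma[i])
			return true;
	return false;
}

/* Count the drains needed so that no nocma allocation sees a CMA page. */
static unsigned int toy_count_drains(const enum toy_event *ev, size_t n)
{
	struct toy_pcp pcp = { .count = 0 };
	unsigned int drains = 0;

	toy_pcp_free(&pcp, true);	/* assumed starting state: one CMA page */

	for (size_t i = 0; i < n; i++) {
		if (ev[i] == TOY_EV_CMA_FREE) {
			toy_pcp_free(&pcp, true);
		} else if (toy_pcp_has_cma(&pcp)) {
			pcp.count = 0;	/* drain the whole list */
			drains++;
		}
	}
	return drains;
}
```

With a burst of nocma allocations and no interleaved frees the model
drains once, which is the case the drain approach hopes for. Each CMA free
that lands between two allocations forces another drain, whereas a bypass
never drains at all.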

I don't really see it as a layering violation of the API just because all
order-0 pages currently go through the PCP lists. The fact that order-0 is
serviced from the pcp list is an internal implementation detail; the API
doesn't care.

--
Mel Gorman
SUSE Labs