Re: [PATCH for v5.9] mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs

From: Joonsoo Kim
Date: Wed Aug 26 2020 - 01:12:59 EST


2020년 8월 25일 (화) 오후 6:43, Vlastimil Babka <vbabka@xxxxxxx>님이 작성:
>
>
> On 8/25/20 6:59 AM, js1304@xxxxxxxxx wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> >
> > memalloc_nocma_{save/restore} APIs can be used to skip page allocation
> > on CMA area, but, there is a missing case and the page on CMA area could
> > be allocated even if APIs are used. This patch handles this case to fix
> > the potential issue.
> >
> > Missing case is an allocation from the pcplist. MIGRATE_MOVABLE pcplist
> > could have the pages on CMA area so we need to skip it if ALLOC_CMA isn't
> > specified.
> >
> > This patch implements this behaviour by checking allocated page from
> > the pcplist rather than skipping an allocation from the pcplist entirely.
> > Skipping the pcplist entirely would result in a mismatch between watermark
> > check and actual page allocation.
>
> Are you sure? I think a mismatch exists already. Pages can be on the pcplist but
> they are not considered as free in the watermark check. So passing watermark
> check means there should be also pages on free lists. So skipping pcplists would
> be safe, no?

You are right.

> > And, it requires to break current code
> > layering that order-0 page is always handled by the pcplist. I'd prefer
> > to avoid it so this patch uses different way to skip CMA page allocation
> > from the pcplist.
>
> Well it would be much simpler and won't affect most of allocations. Better than
> flushing pcplists IMHO.

Hmm...Still, I'd prefer my approach. There are two reasons. First,
layering problem
mentioned above. In rmqueue(), there is a code for MIGRATE_HIGHATOMIC.
As the name shows, it's for high order atomic allocation. But, after
skipping pcplist
allocation as you suggested, we could get there with order 0 request.
We can also
change this code, but, I'd hope to maintain current layering. Second,
a performance
reason. After the flag for nocma is up, a burst of nocma allocation
could come. After
flushing the pcplist one times, we can use the free page on the
pcplist as usual until
the context is changed.

How about my reasoning?

Thanks.