Re: [RFC][Patch v12 1/2] mm: page_reporting: core infrastructure

From: Alexander Duyck
Date: Tue Aug 13 2019 - 19:14:50 EST


On Tue, Aug 13, 2019 at 3:34 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> >>>> +static int process_free_page(struct page *page,
> >>>> + struct page_reporting_config *phconf, int count)
> >>>> +{
> >>>> + int mt, order, ret = 0;
> >>>> +
> >>>> + mt = get_pageblock_migratetype(page);
> >>>> + order = page_private(page);
> >>>> + ret = __isolate_free_page(page, order);
> >>>> +
> >> I just started looking into the wonderful world of
> >> isolation/compaction/migration.
> >>
> >> I don't think saving/restoring the migratetype is correct here. AFAIK,
> >> MOVABLE/UNMOVABLE/RECLAIMABLE is just a hint, doesn't mean that e.g.,
> >> movable pages and up in UNMOVABLE or ordinary kernel allocations on
> >> MOVABLE. So that shouldn't be an issue - I guess.
> >>
> >> 1. You should never allocate something that is no
> >> MOVABLE/UNMOVABLE/RECLAIMABLE. Especially not, if you have ISOLATE or
> >> CMA here. There should at least be a !is_migrate_isolate_page() check
> >> somewhere
> >>
> >> 2. set_migratetype_isolate() takes the zone lock, so to avoid racing
> >> with isolation code, you have to hold the zone lock. Your code seems to
> >> do that, so at least you cannot race against isolation.
> >>
> >> 3. You could end up temporarily allocating something in the
> >> ZONE_MOVABLE. The pages you allocate are, however, not movable. There
> >> would have to be a way to make alloc_contig_range()/offlining code
> >> properly wait until the pages have been processed. Not sure about the
> >> real implications, though - too many details in the code (I wonder if
> >> Alex' series has a way of dealing with that)
> >>
> >> When you restore the migratetype, you could suddenly overwrite e.g.,
> >> ISOLATE, which feels wrong.
> >
> >
> > I was triggering an occasional CPU stall bug earlier, with saving and restoring
> > the migratetype I was able to fix it.
> > But I will further look into this to figure out if it is really required.
> >
>
> You should especially look into handling isolated/cma pages. Maybe that
> was the original issue. Alex seems to have added that in his latest
> series (skipping isolated/cma pageblocks completely) as well.

So as far as skipping isolated pageblocks, I get the reason for
skipping isolated, but why would we need to skip CMA? I had made the
change I did based on comments you had made earlier. But while working
on some of the changes to address isolation better and looking over
several spots in the code it seems like CMA is already being used as
an allocation fallback for MIGRATE_MOVABLE. If that is the case
wouldn't it make sense to allow pulling pages and reporting them while
they are in the free_list?