Re: [PATCH] mm/memory_hotplug: drain per-cpu pages again during memory offline

From: Vlastimil Babka
Date: Wed Sep 02 2020 - 10:56:40 EST


On 9/2/20 4:26 PM, Pavel Tatashin wrote:
> On Wed, Sep 2, 2020 at 10:08 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>>
>> >
>> > Thread#1 - continue
>> > free_unref_page_commit
>> > migratetype = get_pcppage_migratetype(page);
>> > // get old migration type
>> > list_add(&page->lru, &pcp->lists[migratetype]);
>> > // add new page to already drained pcp list
>> >
>> > Thread#2
>> > Never drains pcp again, and therefore gets stuck in the loop.
>> >
>> > The fix is to try to drain per-cpu lists again after
>> > check_pages_isolated_cb() fails.
>>
>> But this means that the page is not isolated and so it could be reused
>> for something else. No?
>
> The page is in a movable zone, has zero references, and the section is
> isolated (i.e. set_pageblock_migratetype(page, MIGRATE_ISOLATE);) is
> set. The page should be offlinable, but it is lost in a pcp list as
> that list is never drained again after the first failure to migrate
> all pages in the range.

Yeah. To answer Michal's "it could be reused for something else" - yes, somebody
could allocate it from the pcplist before we do the extra drain. But then it
becomes "visible again" and the loop in __offline_pages() should catch it by
scan_movable_pages() - do_migrate_range(). And this time the pageblock is
already marked as isolated, so the page (freed by migration) won't end up on the
pcplist again.