Re: [PATCH] mm/memory_hotplug: drain per-cpu pages again during memory offline

From: Pavel Tatashin
Date: Wed Sep 02 2020 - 10:32:55 EST


> > > The fix is to try to drain per-cpu lists again after
> > > check_pages_isolated_cb() fails.
>
> Still trying to wrap my head around this but I think this is not a
> proper fix. It should be the page isolation to make sure no races are
> possible with the page freeing path.
>

As Bharata B Rao found in another thread, the problem was introduced
by this change:
c52e75935f8d: mm: remove extra drain pages on pcp list

So the drain used to be tried every time with lru_add_drain_all(),
which I think is excessive: we start a thread per CPU to drain, just
to catch a rare race condition. With the proposed change we drain
again only when we actually find such a condition. Fixing it in
start_isolate_page_range() would mean that we must somehow synchronize
it with release_pages(), which adds cost to runtime code instead of
to hot-remove code.
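
To make the trade-off concrete, here is a rough userspace model of the
"drain again only when the isolation check fails" idea. It is not
kernel code and not the patch itself: struct sim_range, the sim_*()
helpers, the retry bound, and the -1 return value are all made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memory range being offlined. */
struct sim_range {
	int pages_on_pcp;  /* pages currently stranded on per-cpu lists */
	int racing_frees;  /* racing frees that can still refill a pcp list */
	int drains;        /* how many drains were performed */
};

/* Models a per-cpu drain: stranded pages go back to the free lists. */
static void sim_drain_pcp(struct sim_range *r)
{
	r->pages_on_pcp = 0;
	r->drains++;
}

/* Models a racing page free that lands on a pcp list after a drain. */
static void sim_racing_free(struct sim_range *r)
{
	if (r->racing_frees > 0) {
		r->racing_frees--;
		r->pages_on_pcp++;
	}
}

/* Models the isolation check: it fails while pages sit on pcp lists. */
static bool sim_pages_isolated(const struct sim_range *r)
{
	return r->pages_on_pcp == 0;
}

/*
 * Drain once up front, then drain again only if the isolation check
 * fails. The retry bound is an assumption of this sketch.
 * Returns 0 on success, -1 if the range never became isolated.
 */
static int sim_offline(struct sim_range *r, int max_retries)
{
	assert(max_retries >= 0);

	sim_drain_pcp(r);
	sim_racing_free(r);	/* race window after the first drain */

	for (int i = 0; i < max_retries; i++) {
		if (sim_pages_isolated(r))
			return 0;
		sim_drain_pcp(r);	/* extra cost paid only on the rare race */
		sim_racing_free(r);
	}
	return sim_pages_isolated(r) ? 0 : -1;
}
```

In the common case (no racing free) the model performs exactly one
drain; the second drain happens only when the check catches a page
that slipped onto a pcp list, so the cost stays in the offline path.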

Pasha