Re: [PATCH v3] mm/page_alloc: fix counting of free pages after take off from buddy

From: HORIGUCHI NAOYA(堀口 直也)
Date: Wed May 26 2021 - 20:34:41 EST


On Wed, May 26, 2021 at 03:52:47PM +0800, Ding Hui wrote:
> Recently we found that /proc/meminfo still reports a lot of MemFree
> after soft-offlining many pages, which is not correct.
>
> Before Oscar reworked soft offline for free pages [1], soft-offlined
> free pages were left in buddy with the HWPoison flag set, and
> NR_FREE_PAGES was not updated immediately. So the difference between
> NR_FREE_PAGES and the real number of available free pages was also
> large at the beginning.
>
> However, as the workload ran, whenever an alloc function subsequently
> caught a HWPoison page, it removed the page from buddy, updated
> NR_FREE_PAGES, and tried again, so NR_FREE_PAGES got closer and closer
> to the real number of available free pages (regardless of
> unpoison_memory()).
>
> Now, when offlining free pages, after a successful call to
> take_page_off_buddy() the page no longer belongs to the buddy
> allocator and will not be used any more, but we missed accounting
> NR_FREE_PAGES in this situation, and there is no chance for it to be
> updated later.
>
> Update the counter in take_page_off_buddy() like rmqueue() does, but
> avoid double counting if someone has already called
> set_migratetype_isolate() on the page.
>
> [1]: commit 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")
>
> Suggested-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> Signed-off-by: Ding Hui <dinghui@xxxxxxxxxxxxxx>

Thank you very much.

Acked-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>

As for unpoison_memory(), I'm writing patches to fix unpoison (it may take
a few weeks before they are posted). They will add a reverse operation of
take_page_off_buddy() that simply calls __free_one_page(), so the
NR_FREE_PAGES counter will also be handled correctly with those patches.
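
To make the accounting rule above concrete, here is a minimal userspace
sketch. It is a toy model, not the patch and not kernel code: struct
toy_zone, struct toy_page and all toy_*() helpers are hypothetical names
invented for illustration, pages are order-0 only, and it assumes that
isolating a pageblock already subtracts its free pages from the counter
(which is why the take-off path must skip isolated pages).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a zone; nr_free_pages models NR_FREE_PAGES. */
struct toy_zone {
	long nr_free_pages;
};

/* Toy stand-in for an order-0 page. */
struct toy_page {
	bool in_buddy;	/* page is on a buddy free list */
	bool isolated;	/* models a MIGRATE_ISOLATE pageblock */
	bool hwpoison;	/* models the HWPoison flag */
};

/* Assumption: isolation already removes free pages from the counter. */
static void toy_isolate(struct toy_zone *z, struct toy_page *p)
{
	if (p->in_buddy && !p->isolated)
		z->nr_free_pages -= 1;
	p->isolated = true;
}

/*
 * Models the idea of the fix: once the page leaves buddy, decrement the
 * counter, unless isolation has already accounted for it.
 */
static bool toy_take_page_off_buddy(struct toy_zone *z, struct toy_page *p)
{
	if (!p->in_buddy)
		return false;
	p->in_buddy = false;
	p->hwpoison = true;
	if (!p->isolated)
		z->nr_free_pages -= 1;
	return true;
}

/* Models the planned reverse operation: freeing the page re-adds it. */
static void toy_put_page_back(struct toy_zone *z, struct toy_page *p)
{
	if (p->in_buddy)
		return;
	p->in_buddy = true;
	p->hwpoison = false;
	if (!p->isolated)
		z->nr_free_pages += 1;
}

/* Walks through the three cases and returns the final counter value. */
long toy_demo(void)
{
	struct toy_zone z = { .nr_free_pages = 4 };
	struct toy_page a = { .in_buddy = true };
	struct toy_page b = { .in_buddy = true };

	toy_take_page_off_buddy(&z, &a);	/* counter drops by one */
	toy_isolate(&z, &b);			/* isolation accounts for b */
	toy_take_page_off_buddy(&z, &b);	/* no second decrement */
	toy_put_page_back(&z, &a);		/* unpoison-style reverse */
	return z.nr_free_pages;
}
```

The point of the sketch is only the isolated check: without it, page b
would be subtracted twice, once on isolation and once on take-off.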