Re: [PATCH 1/2] mm, page_alloc: Keep pcp count and list contents in sync if struct page is corrupted

From: Mel Gorman
Date: Fri Dec 02 2016 - 04:49:43 EST


On Fri, Dec 02, 2016 at 09:12:17AM +0100, Michal Hocko wrote:
> On Fri 02-12-16 00:22:43, Mel Gorman wrote:
> > Vlastimil Babka pointed out that commit 479f854a207c ("mm, page_alloc:
> > defer debugging checks of pages allocated from the PCP") will allow the
> > per-cpu list counter to be out of sync with the per-cpu list contents
> > if a struct page is corrupted. This patch keeps the accounting in sync.
> >
> > Fixes: 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP")
> > Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
> > cc: stable@xxxxxxxxxxxxxxx [4.7+]
>
> I am trying to think about what would happen if we did go out of sync
> and cannot spot a problem. Vlastimil has mentioned something about
> free_pcppages_bulk looping for ever but I cannot see it happening right
> now.

free_pcppages_bulk can infinite loop if the page count is positive and
there are no pages. While I've only seen this during development, a
corrupted count loops here

do {
batch_free++;
if (++pindex == NR_PCP_LISTS)
pindex = 0;
list = &pcp->lists[pindex];
} while (list_empty(list));

It would only be seen in a situation where struct page corruption was
detected so it's rare.

--
Mel Gorman
SUSE Labs