Re: [mm 4.15-rc8] Random oopses under memory pressure.

From: Dave Hansen
Date: Wed Jan 17 2018 - 17:04:57 EST


On 01/17/2018 01:51 PM, Linus Torvalds wrote:
> In fact, it seems to be such a fundamental bug that I suspect I'm
> entirely wrong, and full of shit. So it's an interesting and not
> _obviously_ incorrect theory, but I suspect I must be missing
> something.

I'll just note that a few of the pfns I decoded were smack in the middle
of the zone, not near either the high or low end of ZONE_NORMAL where we
would expect this cross-zone stuff to happen.

But I guess we could see similar wonkiness, with 'struct page'
corrupted in so many different ways, if during buddy merging you do:

	list_del(&buddy->lru);

and 'buddy' is off in another zone for which you do not hold the
spinlock. If we are somehow missing some locking, or double-allocating
a page, something like this would help catch it:

 static inline void rmv_page_order(struct page *page)
 {
+	WARN_ON_ONCE(!PageBuddy(page));
 	__ClearPageBuddy(page);
 	set_page_private(page, 0);
 }
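
To illustrate what the added check would flag, here is a minimal
user-space sketch, not kernel code. Everything in it is an assumption
made for illustration: the two-field 'struct page', the warn_on_once()
helper (a simplified stand-in for WARN_ON_ONCE() that fires once for the
whole program rather than once per call site), the warn_count counter,
and demo_double_removal(). It only models the case where the same page
is pulled out of the buddy lists twice, as a double allocation might do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct page. */
struct page {
	bool buddy;            /* models the PageBuddy flag */
	unsigned long private; /* models page_private(), i.e. the buddy order */
};

static int warn_count;         /* how many warnings have been emitted */

/* Simplified model of WARN_ON_ONCE(): report only the first hit. */
static bool warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Same shape as the patched function, on the toy struct page. */
static void rmv_page_order(struct page *page)
{
	warn_on_once(!page->buddy, "rmv_page_order() on a non-buddy page");
	page->buddy = false;   /* models __ClearPageBuddy(page) */
	page->private = 0;     /* models set_page_private(page, 0) */
}

/* Remove one page twice; return how many new warnings that produced. */
static int demo_double_removal(void)
{
	struct page p = { .buddy = true, .private = 3 };
	int before = warn_count;

	rmv_page_order(&p);    /* legitimate removal: flag was set */
	assert(!p.buddy && p.private == 0);
	rmv_page_order(&p);    /* second removal: flag already clear */
	return warn_count - before;
}
```

The first demo_double_removal() call produces one warning, from the
second removal. Later calls produce none because the helper only reports
once, which is the point of the _ONCE variant: it leaves one trace
without flooding the log.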