Re: 2.6.32.5 regression: page allocation failure. order:1,

From: Mel Gorman
Date: Wed Jan 27 2010 - 07:08:41 EST


On Tue, Jan 26, 2010 at 09:13:27PM -0500, Mark Lord wrote:
> I recently upgraded our 24/7 server from 2.6.31.5 to 2.6.32.5.
>
> Now, suddenly the logs are full of "page allocation failure. order:1",
> and the odd "page allocation failure. order:4" failures.
>
> Wow. WTF happened in 2.6.32 ???
>

There was one bug related to MIGRATE_RESERVE that might be affecting
you. It was reported as impacting swap-orientated workloads but it could
easily affect drivers that depend on high-order atomic allocations.
Unfortunately, the fix is not signed-off yet but I expect it to make its
way towards mainline once it is.

Here is the patch with a slightly-altered changelog. Can you test if it
makes a difference please?

==== CUT HERE ====
From: Hugh Dickins <hugh.dickins@xxxxxxxxxxxxx>
Subject: Fix 2.6.32 slowdown in heavy swapping

There is a problem that shows up when simply building kernels as part of
tmpfs loop swapping tests, and it's only obvious on the PowerPC G5. The
problem is that those swapping builds run about 20% slower in 2.6.32 than
2.6.31 (and look as if they run increasingly slowly, though I'm not
certain of that); and surprisingly it bisected down to your commit
5f8dcc21211a3d4e3a7a5ca366b469fb88117f61 page-allocator: split per-cpu list
into one-list-per-migrate-type

The problem is that MIGRATE_RESERVE pages are being put on the
MIGRATE_MOVABLE list, then freed as MIGRATE_MOVABLE. While it is not clear
why this has such a severe impact, it may be down to how many short-lived
high-order allocations are taking place. On machines making large numbers of
short-lived-high-order allocations, they may be depending on MIGRATE_RESERVE
to allocate in a timely fashion. In the case where they are GFP_ATOMIC,
they may be depending on MIGRATE_RESERVE to just work.

The simplest, straight bugfix, patch is the one below: rely on
page_private instead of migratetype when freeing.

Unfortunately-not-signed-off-by: Hugh Dickins <hugh.dickins@xxxxxxxxxxxxx>
Acked-by: Mel Gorman <mel@xxxxxxxxx>

---
mm/page_alloc.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

--- 2.6.33-rc1/mm/page_alloc.c 2009-12-18 11:42:54.000000000 +0000
+++ linux/mm/page_alloc.c 2009-12-20 19:10:50.000000000 +0000
@@ -555,8 +555,9 @@ static void free_pcppages_bulk(struct zo
 			page = list_entry(list->prev, struct page, lru);
 			/* must delete as __free_one_page list manipulates */
 			list_del(&page->lru);
-			__free_one_page(page, zone, 0, migratetype);
-			trace_mm_page_pcpu_drain(page, 0, migratetype);
+			/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
+			__free_one_page(page, zone, 0, page_private(page));
+			trace_mm_page_pcpu_drain(page, 0, page_private(page));
 		} while (--count && --batch_free && !list_empty(list));
 	}
 	spin_unlock(&zone->lock);
