Re: 2.6.32.5 regression: page allocation failure. order:1,

From: Mark Lord
Date: Wed Jan 27 2010 - 23:17:41 EST


Mark Lord wrote:
> Mel Gorman wrote:
>> On Tue, Jan 26, 2010 at 09:13:27PM -0500, Mark Lord wrote:
>>> I recently upgraded our 24/7 server from 2.6.31.5 to 2.6.32.5.
>>>
>>> Now, suddenly the logs are full of "page allocation failure. order:1",
>>> and the odd "page allocation failure. order:4" failures.
>>>
>>> Wow. WTF happened in 2.6.32 ???
>>
>> There was one bug related to MIGRATE_RESERVE that might be affecting
>> you. It was reported as impacting swap-orientated workloads but it could
>> easily affect drivers that depend on high-order atomic allocations.
>> Unfortunately, the fix is not signed-off yet but I expect it to make its
>> way towards mainline when it is.
>>
>> Here is the patch with a slightly-altered changelog. Can you test if it
>> makes a difference please?
> ..
>
> We don't like to reboot our 24/7 server very often,
> and certainly not for debugging buggy kernels.
>
> It's rock solid again with 2.6.31.12 on it now.
>
> The defining characteristic of that machine is that it has only 512MB
> of physical RAM. So perhaps I'll try booting a different machine here
> with mem=512M and see how that behaves. If the problem shows up on that,
> then I'll try the patch.
..

Sod it. 2.6.32 is simply too broken for us here on 32-bit non-SMP.

Attempting to boot a 32-bit kernel with "nosmp mem=512M" on my notebook
locks up at boot time with several repeated messages like this:

request_module: runaway loop modprobe binfmt_464c

Useless kernel on 32-bit. I hope 2.6.33 ends up less buggy.

Cheers