Re: I have a blaze of 353 page allocation failures, all alike

From: Christoph Lameter
Date: Tue Apr 12 2011 - 14:08:18 EST


On Tue, 12 Apr 2011, Peter Kruse wrote:

> Hello,
>
> On 02/24/2011 01:01 PM, Peter Kruse wrote:
> > it took a while to find a date for a reboot... Unfortunately
> > it was not possible to get the early boot messages with the
> > kernel 2.6.32.23 since the compiled in log buffer is too
> > small. So we installed as you suggested a more recent kernel
> > 2.6.32.29 with a bigger log buffer, I attach the dmesg
> > of that, and hope that the information in there is useful.
> > We will keep an eye on that server with the newer kernel
> > to see if the allocation failures appear again.
>
> the server was running for a few without any more allocation
> failures with kernel 2.6.32.29 but at one point the server
> stopped responding, it was still possible for a while to
> get a login, and trying to kill some processes but that
> didn't succeed. But after that even login was
> no longer possible so we had to reset it.
> I attach the call trace, I hope you can find out what is
> the problem.

The problem may be that you have lots and lots of SCSI devices, which
consume ZONE_DMA memory for their control structures. I guess that is
oversubscribing the 16M zone.
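
One rough way to check how much of each zone is still free is to sum the
per-order free block counts in /proc/buddyinfo. The sketch below is only
an illustration, not something taken from this thread: the function name
free_kib_per_zone is made up, and it assumes 4 KiB pages (typical on x86).
Each column after the zone name is the number of free blocks of order
0, 1, 2, ..., so a block of order n covers 2^n pages.

```python
import re

PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages (typical on x86)


def free_kib_per_zone(buddyinfo_text):
    """Return {(node, zone): free KiB} from /proc/buddyinfo-style text.

    Hypothetical helper for illustration only.
    """
    result = {}
    for line in buddyinfo_text.splitlines():
        # Lines look like: "Node 0, zone      DMA      1      1      0 ..."
        m = re.match(r"Node\s+(\d+),\s+zone\s+(\S+)\s+(.*)", line)
        if not m:
            continue
        node = int(m.group(1))
        zone = m.group(2)
        counts = m.group(3).split()
        # Column i holds the number of free blocks of order i (2**i pages each).
        pages = sum(int(c) << order for order, c in enumerate(counts))
        result[(node, zone)] = pages * PAGE_SIZE_KIB
    return result
```

On the affected machine this could be fed the contents of /proc/buddyinfo,
for example free_kib_per_zone(open("/proc/buddyinfo").read()). If the DMA
entry stays close to zero while the other zones have plenty free, that
would be consistent with the 16M zone being exhausted.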
