Re: OOM kernel behaviour

From: Mel Gorman
Date: Wed Dec 02 2009 - 10:55:48 EST


On Tue, Dec 01, 2009 at 11:26:37AM -0600, Christoph Lameter wrote:
> On Tue, 1 Dec 2009, David John wrote:
>
> > Here are three logs from three days. Log3.txt is today's log and the OOM
> > killer murdered Thunderbird as I was attempting to write this message.
> > The kernel config is also attached.
>
> Hmmm... This is all caused by the HIGHMEM zone freecount going beyond min
> which then triggers reclaim which for some reason fails (should not there
> is sufficient material there to reclaim). There is enough memory in the
> NORMAL zone. Wonder if something broke in 2.6.31 in reclaim? Mel?
>

I'm not aware of breakage at that level, nor do I believe the page
allocator problems are related to this bug.

However, I just took a look at the logs from the three days and I see
things like

Nov 25 23:58:53 avalanche kernel: Free swap = 0kB
Nov 25 23:58:53 avalanche kernel: Total swap = 2048248kB


Something on that system is leaking badly. Do something like

ps aux --sort vsz

and see which process has gone mental and is consuming all of swap. It's
possible that the OOM killer is triggering too easily, but it's also
possible that a delayed triggering of the OOM killer would have been just
that - a delay. Eventually all memory and all swap would be used.
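
If you would rather collect this periodically than eyeball it, something
along these lines could pick the largest processes out of "ps aux"
output. Treat it as a sketch: the helper names are hypothetical, and it
assumes the usual procps column order (USER PID %CPU %MEM VSZ RSS TTY
STAT START TIME COMMAND). Note that VSZ is virtual size, so a large value
is a hint about where to look, not proof of a leak.

```python
import subprocess


def top_by_vsz(ps_output, n=5):
    """Return the n largest processes as (vsz_kb, pid, command) tuples.

    ps_output is the text printed by "ps aux"; the first line is assumed
    to be the column header.
    """
    rows = []
    for line in ps_output.splitlines()[1:]:
        # COMMAND may contain spaces, so split into at most 11 fields.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        try:
            pid = int(fields[1])
            vsz_kb = int(fields[4])
        except ValueError:
            continue  # skip lines that do not match the assumed layout
        rows.append((vsz_kb, pid, fields[10]))
    rows.sort(reverse=True)  # largest virtual size first
    return rows[:n]


def read_ps():
    """Run "ps aux" and return its output as text."""
    result = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling top_by_vsz(read_ps()) every few minutes and logging the result
should show which process keeps growing before the OOM killer fires.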

--
Mel Gorman
Part-time PhD Student Linux Technology Center
University of Limerick IBM Dublin Software Lab