Re: Kernel 2.6.29 runs out of memory and hangs.

From: David Rientjes
Date: Fri Apr 24 2009 - 15:00:32 EST


On Fri, 24 Apr 2009, Zeno Davatz wrote:

> Apr 24 09:01:06 thinpower [1349923.693331] Out of memory: kill process
> 21490 (apache2) score 53801 or a child
> Apr 24 09:01:06 thinpower [1349923.693410] Killed process 21490 (apache2)
>

If your machine hangs here, it's most likely because apache2 is stuck in
D state (uninterruptible sleep) and cannot exit. Since it has been oom
killed, it has TIF_MEMDIE set and therefore has access to memory
reserves, so it may deplete all remaining memory.
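
If a console is still usable while this is happening, one way to confirm
the D state theory is to scan /proc for tasks in uninterruptible sleep.
The following is only a minimal sketch: the function names are made up
for illustration, and it reads /proc/<pid>/stat, which reports the state
of the thread group leader only (individual threads live under
/proc/<pid>/task/).

```python
import os

def parse_stat(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the LAST ')' rather than on whitespace.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    state = line[rparen + 1:].split()[0]
    return pid, comm, state

def tasks_in_d_state(proc="/proc"):
    """List (pid, comm) for every process currently in state D."""
    stuck = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a pid directory
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were scanning
        if state == "D":
            stuck.append((pid, comm))
    return stuck
```

Calling tasks_in_d_state() a few times in a row and seeing the killed
apache2 pid in every result would suggest that it really is failing to
exit.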

I'm assuming that by a machine hang you mean the inability to ping it or
ssh into it, not simply your apache server dying.

These types of livelocks are possible with the oom killer when a task
fails to exit. One possible way to fix that is to introduce an oom killer
timeout: if a task fails to exit within a pre-defined period of time, the
oom killer chooses another task to kill in the hope of freeing memory.
The problem with that approach, however, is that the hung task can hold
an enormous amount of memory that will never be freed.
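
To make the tradeoff concrete, here is a toy userspace model of the
timeout idea. It is not kernel code and does not reflect how the oom
killer is implemented: the Task fields, the function name, and every
number except the apache2 score from the log above are assumptions made
up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    score: int      # badness score; the highest score is killed first
    pages: int      # memory the task would free if it exited
    can_exit: bool  # False models a task stuck in D state

def oom_kill_with_timeout(tasks, pages_needed, timeout):
    """Return (victims, pages_freed, time_waited) for a toy oom killer."""
    victims, freed, waited = [], 0, 0
    for task in sorted(tasks, key=lambda t: t.score, reverse=True):
        if freed >= pages_needed:
            break  # enough memory has been reclaimed
        victims.append(task.name)
        if task.can_exit:
            freed += task.pages
        else:
            # The victim never exits: wait out the timeout, then move
            # on to the next candidate. Its pages are never reclaimed.
            waited += timeout
    return victims, freed, waited

tasks = [
    Task("apache2", 53801, 900, can_exit=False),  # hung victim
    Task("mysqld", 2000, 300, can_exit=True),     # hypothetical
    Task("sshd", 10, 5, can_exit=True),           # hypothetical
]
```

With these numbers, a request for 200 pages is satisfied after one
timeout by killing a second task, but a request for 1000 pages kills
everything and still falls short, because the 900 pages held by the hung
task never come back.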

> > If this is reproducible, I'd recommend enabling
> > /proc/sys/vm/oom_dump_tasks so that the oom killer will dump the tasklist
> > and show us what may be causing the livelock.
>
> Ok, how do I enable that? I will google for it.
>

You're right in your reply; you can enable it with

echo 1 > /proc/sys/vm/oom_dump_tasks

This will print the tasklist and some pertinent information alongside the
oom killer output you've already posted. It will give a better idea of
the memory usage on the machine and whether killing a subsequent task
would actually help in this case.
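
When collecting that output to post, it may help to pull the kill records
out of the syslog first. A minimal sketch follows; the regex only matches
the message wording quoted earlier in this thread, and the names OOM_RE
and find_oom_kills are hypothetical. It does not attempt to parse the
tasklist dump itself.

```python
import re

# Matches the "Out of memory: kill process <pid> (<comm>) score <n>"
# wording from the log excerpt quoted in this thread.
OOM_RE = re.compile(
    r"Out of memory: kill process\s+(?P<pid>\d+)\s+\((?P<comm>[^)]+)\)"
    r"\s+score\s+(?P<score>\d+)"
)

def find_oom_kills(text):
    """Return a list of (pid, comm, score) for each kill record in text."""
    return [
        (int(m["pid"]), m["comm"], int(m["score"]))
        for m in OOM_RE.finditer(text)
    ]
```

Passing the contents of the log file as a single string returns one
tuple per oom kill, which makes it easy to see whether the same pid is
being selected over and over.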