Re: [patch 3/5] oom: less memdie

From: Nick Piggin
Date: Fri Oct 13 2006 - 02:38:43 EST


Andrew Morton wrote:

On Thu, 12 Oct 2006 16:10:01 +0200 (CEST)
Nick Piggin <npiggin@xxxxxxx> wrote:


Don't cause all threads in all other thread groups to gain TIF_MEMDIE
otherwise we'll get a thundering herd eating our memory reserve. This
may not be the optimal scheme, but it fits our policy of allowing just
one TIF_MEMDIE in the system at once.

Signed-off-by: Nick Piggin <npiggin@xxxxxxx>

Index: linux-2.6/mm/oom_kill.c
===================================================================
--- linux-2.6.orig/mm/oom_kill.c
+++ linux-2.6/mm/oom_kill.c
@@ -322,11 +322,12 @@ static int oom_kill_task(struct task_str

/*
* kill all processes that share the ->mm (i.e. all threads),
- * but are in a different thread group.
+ * but are in a different thread group. Don't let them have access
+ * to memory reserves though, otherwise we might deplete all memory.
*/
do_each_thread(g, q) {
if (q->mm == mm && q->tgid != p->tgid)
- __oom_kill_task(q, 1);
+ force_sig(SIGKILL, q);
} while_each_thread(g, q);



Curious. How much testing did you do of this stuff? I assume there were
some observed problems. What were they, and what was the observed effect
of these changes?


I didn't actually test this change, because I don't have any apps to speak
of which use multiple thread groups.

I stumbled on it by inspection while trying to fix the killing of OOM_DISABLE
tasks. Basically, we don't set TIF_MEMDIE or boost the priority of *any* other
thread in our own thread group, so we shouldn't do it for *all* other threads
of all other groups.
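
To make the intended behaviour concrete, here is a minimal user-space sketch of
the kill loop. It is not kernel code: struct toy_task, its fields and
toy_oom_kill() are made-up names for illustration, and mm_id merely stands in
for the ->mm pointer. The only point is that the victim alone ends up with the
memdie flag, while tasks in other thread groups sharing its mm just get the
signal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few task_struct fields used here. */
struct toy_task {
    int  tgid;      /* thread group id */
    int  mm_id;     /* stands in for ->mm; equal ids mean a shared mm */
    bool memdie;    /* models TIF_MEMDIE */
    bool sigkill;   /* models a pending SIGKILL */
};

/*
 * Model of the patched loop: the victim gets SIGKILL plus reserve access;
 * every task in another thread group that shares its mm gets SIGKILL only.
 * Returns the number of tasks holding the memdie flag afterwards.
 */
static size_t toy_oom_kill(struct toy_task *tasks, size_t n, size_t victim)
{
    size_t i, memdie_count = 0;

    assert(victim < n);

    tasks[victim].memdie = true;
    tasks[victim].sigkill = true;

    for (i = 0; i < n; i++) {
        struct toy_task *q = &tasks[i];

        if (q->mm_id == tasks[victim].mm_id &&
            q->tgid != tasks[victim].tgid)
            q->sigkill = true;  /* killed, but no access to reserves */
    }

    for (i = 0; i < n; i++)
        if (tasks[i].memdie)
            memdie_count++;

    return memdie_count;
}
```

However many thread groups share the victim's mm, the count returned stays at
one, which is the "one TIF_MEMDIE at once" policy.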

Consider an OOM situation, where there will likely be a lot of threads stuck in
__alloc_pages. If a large number of these suddenly get a big timeslice and full
access to memory reserves, they'll eat into more than we'd like.
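
A rough toy model of that effect, under loud assumptions: struct toy_zone,
toy_alloc_page() and toy_run() are invented for illustration, a single "zone"
with one minimum watermark stands in for the real allocator, and the page
counts are arbitrary. Ordinary callers must stay above the watermark; callers
with the memdie flag may go below it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy zone: free pages and the watermark guarding the reserve. */
struct toy_zone {
    long free_pages;
    long min_watermark;
};

/*
 * Take one page. An ordinary caller must leave the zone above its minimum
 * watermark; a caller with memdie set may dip into the reserve as long as
 * any page is left. Returns true if a page was handed out.
 */
static bool toy_alloc_page(struct toy_zone *z, bool memdie)
{
    long floor = memdie ? 0 : z->min_watermark;

    if (z->free_pages <= floor)
        return false;
    z->free_pages--;
    return true;
}

/*
 * Let nr_threads stuck allocators each try to grab pages_each pages, with
 * the first nr_memdie of them flagged memdie. Returns the free pages left.
 */
static long toy_run(long free_pages, long min_watermark,
                    int nr_threads, int nr_memdie, int pages_each)
{
    struct toy_zone z = { free_pages, min_watermark };
    int t, i;

    assert(nr_memdie <= nr_threads);

    for (t = 0; t < nr_threads; t++)
        for (i = 0; i < pages_each; i++)
            toy_alloc_page(&z, t < nr_memdie);

    return z.free_pages;
}
```

Starting at the watermark with 64 free pages, 16 threads wanting 8 pages each
leave 56 pages if only one of them has memdie (toy_run(64, 64, 16, 1, 8)), but
drain the reserve to 0 if all 16 have it (toy_run(64, 64, 16, 16, 8)).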

Now I'm not sure that our current OOM killing / memory reserving scheme is
perfect -- indeed if we only allow a single TIF_MEMDIE thread at once, we can
get deadlocks. However, that's the direction we've chosen, and it seems to work
reasonably well. Mostly.

This is just an enforcement of that policy rather than a change in direction.
Criticism is always welcome though.
