Re: VM fixes [4/4]

From: Andrea Arcangeli
Date: Sun Jan 02 2005 - 10:55:11 EST


On Tue, Dec 28, 2004 at 10:42:40AM +0100, Thomas Gleixner wrote:
> On Mon, 2004-12-27 at 08:38 -0500, Rik van Riel wrote:
> > On Fri, 24 Dec 2004, Andrea Arcangeli wrote:
> >
> > > --- x/mm/oom_kill.c.orig 2004-12-24 17:53:50.807536152 +0100
> > > +++ x/mm/oom_kill.c 2004-12-24 18:01:19.903263224 +0100
> > > @@ -45,18 +45,30 @@
> > > unsigned long badness(struct task_struct *p, unsigned long uptime)
> > > {
> >
> > > /*
> > > + * Processes which fork a lot of child processes are likely
> > > + * a good choice. We add the vmsize of the childs if they
> > > + * have an own mm. This prevents forking servers to flood the
> > > + * machine with an endless amount of childs
> > > + */
> >
> > I'm not sure about this one. You'll end up killing the
> > parent httpd and sshd, instead of letting them hang around
> > so the system can recover by itself after the memory use
> > spike is over.
>
> The selection is adding the child VM size, but the killer itself kills a
> child process first, so the parent is not the one which is killed in the
> first place.

The other part of Thomas's change is this one:

+static struct mm_struct *oom_kill_process(task_t *p)
+{
+ struct mm_struct *mm;
+ struct task_struct *c;
+ struct list_head *tsk;
+
+ /* Try to kill a child first */
+ list_for_each(tsk, &p->children) {
+ c = list_entry(tsk, struct task_struct, sibling);
+ if (c->mm == p->mm)
+ continue;
+ mm = oom_kill_task(c);
+ if (mm)
+ return mm;
+ }
+ return oom_kill_task(p);
+}

Thomas's changes worked better than previous code so far, he can clearly
identify forkbombs or services spread across multiple processes.
Without these changes it was trivial to fool the oom killer and lead to
sshd and other services being killed, so his changes makes lots of sense
to me. It seems certainly better than the current code.

Actually the only question I made to him is why he doesn't kill the
"parent" instead of adding the above code, that is what is being
discussed here, but I was suggesting it as a good thing, not as a bad
thing. If we have a fork bomb killing the master parent would be
optimal. What he does above by killing the childs first is a lot more
conservative and I'm fine with it as well.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/