Re: Kernel 2.2.14 OOM killer strikes.

From: Andreas Dilger (adilger@turbolabs.com)
Date: Fri Jul 07 2000 - 12:10:48 EST


Claudio Martins writes:
> [re SIGDANGER being sent when we are nearly at OOM]
>
> The right thing? I don't think so. This doesn't fix anything. Processes
> will still have to exit!

Not necessarily, since SIGDANGER gives programs with handlers a chance to
free up some memory before the situation becomes critical. For example,
Netscape has a 16MB memory cache on my system that it can free at the
drop of a hat, if it had a chance to do so. Xterms, the X server, mpg123,
and many other programs have buffers that are not critical to their
operation that can be freed.

> How can this be tolerated in a multiuser environement? Someone starts a
> memory hog process and the other users in the system read in their terminal:
>
> got SIGDANGER signal: exiting...

It isn't SIGDANGER that kills a process (since the signal is ignored by
default), but rather the kernel uses the information that a program with
a SIGDANGER handler are "more important" than ones without a handler.
It is the OOM killer that kills the processes, just like now - but
SIGDANGER at least gives you a chance to help out, either by freeing
memory and/or by doing a graceful shutdown/saving files before being killed.

If the memory hog process does not have a SIGDANGER handler, then it
will be killed before init/syslog/etc if they DO have a SIGDANGER handler.
If there is truly a malicious user, there are many bad things they can do,
and OOM is only one of them.

> IMO you're only disguising the problem with this aproach, not fixing it.

Somewhat true, but in AIX the rationale for SIGDANGER is that there are
legitimate uses for memory overcommit (like scientific computing), and
having SIGDANGER is still useful, so that you know you are starting to
run out of VM, and deal with it properly.

> Per user resource accounting is surely the right thing, and has been
> proven to work on other UNIX systems.

It would be interesting to have some statistics on what real overcommit
usage is like right now, since any fork/exec will temporarily use 2x the
VM of the original process. For example, Netscape is a big program, and
it forks a DNS handler at startup, so if you limit your users to a
"reasonable" amount of VM, they might not be able to run Netscape, even
though after the exec is completed Netscape+DNS handler do exceed the VM
limit you imposed.

Cheers, Andreas

-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Jul 07 2000 - 21:00:21 EST