Re: [RFC] New ideas for the OOM handler

From: Byron Stanoszek (gandalf@winds.org)
Date: Mon Oct 09 2000 - 20:25:38 EST


On Tue, 10 Oct 2000, Jochen Striepe wrote:

> Hi, a question regarding the OOM process killer...
>
> Hmm, sometimes daemon-like processes (e.g. web servers) only need root
> privileges to open a network port<1024 - you may start them as non-root
> if they do not need such a privileged port. Might be hard to sort them
> out...

That is very true. I run several daemons in non-superuser mode that would
become the victim of that.

This reminds me of an earlier post where I argued that CPU time should not
be factored in. I may have misunderstood (my apologies to Rik). What matters
more is how long the process has been running (daemons usually get started
first thing at bootup, whereas something like 'netscape' can be started and
use up all memory within 30 minutes).

Figuring in Time Since Process Creation can also be a misguided way of doing
things. I don't know how many times I've restarted or upgraded 'named' on my
90-day-uptime system, just to change the configuration file or to back out
invalid serial numbers in DNS zone files.

However, if I have been running a 99% cpu-intensive mathematical modeling
program for the past 10 days, I wouldn't want it to get killed because it
allocated 50 MB in the first second of its life when the system had 250MB to
spare.

On the other hand, what if I started up netscape on day 1, ran tons of other
processes meanwhile, and only used netscape lightly day by day? After 90 days of light
usage, netscape might actually be using 150MB of ram now. Does netscape
(rightfully) get killed, or does my modeling program which only uses 50MB get
killed because I started it only 10 days ago instead of 90 days?

Neither process start date nor CPU Usage % can correctly detect which process
to kill over a period of 10 to 90 days, in this scenario. This is why I don't
like factoring in these two elements. The OOM killer will get this right 90% of
the time, maybe even 95%. But what about the sequential 'child worker' that was
forked off of the modeling program once every 5 hours?
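
A rough sketch of the scenario above, in Python, comparing three scoring
rules. The formulas are illustrative assumptions and not the kernel's actual
heuristic; the CPU figure for netscape and the size and age of the child
worker are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    name: str
    mem_mb: float    # resident memory in MB
    age_days: float  # time since process creation
    cpu_days: float  # accumulated CPU time


# Hypothetical scoring rules: the highest score gets killed.
def score_memory_only(p):
    return p.mem_mb


def score_age_weighted(p):
    # Older processes are protected: divide memory by age.
    return p.mem_mb / max(p.age_days, 1e-3)


def score_cpu_weighted(p):
    # Processes with lots of CPU time are protected.
    return p.mem_mb / max(p.cpu_days, 1e-3)


def pick_victim(procs, score):
    return max(procs, key=score).name


netscape = Proc("netscape", mem_mb=150, age_days=90, cpu_days=0.5)  # CPU time assumed
model = Proc("model", mem_mb=50, age_days=10, cpu_days=9.9)         # 99% CPU for 10 days
worker = Proc("model-worker", mem_mb=50, age_days=0.1, cpu_days=0.1)  # assumed figures

for rule in (score_memory_only, score_age_weighted, score_cpu_weighted):
    print(rule.__name__,
          "| without worker:", pick_victim([netscape, model], rule),
          "| with worker:", pick_victim([netscape, model, worker], rule))
```

Under these assumptions the age-weighted rule picks the modeling program over
netscape, and both weighted rules pick the freshly forked worker once it
exists, which is the failure described above.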

I think there should be a better solution.

> What about a user-defined list of "wishes"? The administrator should be
> enabled to enforce that specific processes are to be terminated only as a
> last resource (syslogd), or that they should be killed first (netscape).
> Could that be done using some /proc interface - some lines, each
> containing a program name, and a modifier for the killing priority?

echo "init" > /proc/sys/oom-ignore
echo "httpd" >> /proc/sys/oom-ignore
echo "parallel-fft" >> /proc/sys/oom-ignore
 etc...

This is a very workable option. It allows the admin to define what is
"important" on his computer and tells the OOM killer to terminate those
processes only as a last resort (or to ignore them completely).
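
A minimal sketch of how such a list might be consulted when choosing a
victim, written in Python for illustration only. /proc/sys/oom-ignore is the
interface proposed in this thread, not an existing one; the function names,
the one-name-per-line format, and the "largest process loses" rule are
assumptions.

```python
def parse_ignore_list(text):
    """Return the set of program names, one per non-empty line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def pick_victim(procs, ignored):
    """procs is a list of (name, mem_mb) pairs.

    Prefer the largest process that is not on the ignore list; fall back
    to listed processes only when nothing else is left (the "last resort"
    variant). Returns None if there are no processes at all.
    """
    unprotected = [p for p in procs if p[0] not in ignored]
    pool = unprotected or procs
    if not pool:
        return None
    return max(pool, key=lambda p: p[1])[0]


# Contents the echo commands above would produce, assuming appends work.
ignored = parse_ignore_list("init\nhttpd\nparallel-fft\n")
procs = [("init", 1), ("httpd", 40), ("parallel-fft", 50), ("netscape", 150)]
print(pick_victim(procs, ignored))
```

For the "ignore completely" variant, the fallback to listed processes would
be dropped and the function would return None when only protected processes
remain.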

I like it. Rik, what do you think?

> Just a thought. Hope my English is not too bad to make my thoughts
> clear... Sorry if this was discussed on l-k before - I do not have the
> time to read each posting on the list.
>
>
> Greetings from Germany,
>
> Jochen Striepe.
>
>

-- 
Byron Stanoszek                         Ph: (330) 644-3059
Systems Programmer                      Fax: (330) 644-8110
Commercial Timesharing Inc.             Email: bstanoszek@comtime.com



