Re: Kernel 2.2.14 OOM killer strikes.

From: Derek Martin (derek@cerberus.ne.mediaone.net)
Date: Fri Jul 07 2000 - 09:49:34 EST


Today, Jesse Pollard gleaned this insight:

> > > Based on what others have said in this thread, it sounds to me like the
> > > right answer is for the kernel to do something like keep track of how fast
> > > each process is allocating/using memory, and kill the highest users. And
> > > it also sounds like that's in the works. So I'll shut up now... :)
>
> Actually it isn't rare- The situation is equivalent to an X server:
>
> 1. Many windows get opened (fast initial allocation)...
> 2. client program does new window (fast initial allocation)
> 3. user selects different window.
> 4. client program attempts update (allocating memory into verge of OOM)
> 5. user selects option in window from #3 - pop-up window fails (OOM).

Er, the way I meant to word it was to track the RATE of allocation, which
captures not only how often but also how much. If the X server or window
manager were allocating the largest amount of memory at the fastest rate,
it seems to me that it IS the problem process.
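
To make "rate of allocation" concrete, here is a minimal user-space sketch.
It is an illustration only, not kernel code and not what 2.2.14 does. The
names Sample, allocation_rates and pick_victim are made up for the example,
and it assumes something has already collected per-process memory samples.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    pid: int          # process the sample belongs to
    time_s: float     # when the sample was taken, in seconds
    bytes_used: int   # memory the process had in use at that moment


def allocation_rates(samples):
    """Return {pid: bytes per second} between each pid's first and last sample.

    Processes with fewer than two samples at distinct times are left out,
    because no rate can be computed for them.
    """
    first, last = {}, {}
    for s in sorted(samples, key=lambda s: s.time_s):
        first.setdefault(s.pid, s)   # earliest sample per pid
        last[s.pid] = s              # latest sample per pid
    rates = {}
    for pid, start in first.items():
        end = last[pid]
        elapsed = end.time_s - start.time_s
        if elapsed > 0:
            rates[pid] = (end.bytes_used - start.bytes_used) / elapsed
    return rates


def pick_victim(samples):
    """Return the pid growing fastest, or None if no rate is known."""
    rates = allocation_rates(samples)
    return max(rates, key=rates.get) if rates else None
```

With an X server that grows from 40MB to 42MB over ten seconds and a
runaway that grows from 1MB to 31MB in the same window, pick_victim
returns the runaway, even though the X server is the larger process.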

In this case, it seems to me that we're probably talking about a desktop
system, and you probably don't really have enough VM to run X. If
this is a server, then don't run X on it and you won't have that problem.
As a system administrator, I have always considered it a bad idea to run X
on a server, as it is a complete waste of resources.

If it's a desktop system, and you *DO* have enough memory to run X, then
you have other problems -- most likely a runaway process. Care needs to
be taken when designing the algorithm that the runaway process be
identified and killed, rather than the X server.

I rather think that in that event, the runaway will be using memory at a
faster rate than the X server anyway. As you point out, X allocates
memory at the fastest rate when the X session first starts. This is also
usually when little else is running on the system, since usually either
a) the system has just come up or b) the user has just logged in.

If you have users logging in to run X, AND you have other things going on
in the background (i.e. other users use it for remote login sessions or
whatever) then you, as the system administrator, probably need to rethink
how this machine is being used, and/or upgrade it.

In any event, killing the X server (and all processes spawned by it) seems
like a much safer option than killing system processes. Granted, it's
annoying to the user it happens to, but much less annoying than having
your home directory scribbled on because some random system process went
astray.

If your system OOMs frequently, then you need more memory, or maybe more
swap (though memory isn't SO much more expensive than disk these days, and
your system will obviously benefit more from fast RAM than from slow disk).
There's no point in getting angry about the kernel killing your processes
off if you refuse to take the hint and upgrade the sucker.

You would probably also want to factor how recent the allocation is into
the equation, somehow, in order to account for a process starting up,
allocating the last 5MB of memory, and then writing to it. I think that
in that case, that is clearly the process that should get killed.
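
One way to weigh recency, purely as a sketch: let each allocation count
for less the older it is, using an exponential decay. This is my own
illustration under assumed inputs, not an existing kernel algorithm. The
names recency_score and pick_victim_by_recency, and the half_life_s
parameter with its default, are hypothetical.

```python
def recency_score(allocations, now_s, half_life_s=10.0):
    """Sum allocation sizes, halving each one's weight every half_life_s.

    allocations is an iterable of (time_s, nbytes) pairs. An allocation
    made right now counts in full; one made half_life_s seconds ago
    counts for half; and so on.
    """
    return sum(
        nbytes * 0.5 ** ((now_s - when_s) / half_life_s)
        for when_s, nbytes in allocations
    )


def pick_victim_by_recency(history, now_s, half_life_s=10.0):
    """Return the key in history with the highest recency-weighted score.

    history maps a process identifier to its list of (time_s, nbytes)
    allocations. Returns None if history is empty.
    """
    if not history:
        return None
    return max(
        history,
        key=lambda pid: recency_score(history[pid], now_s, half_life_s),
    )
```

Under this scoring, an X server that grabbed 40MB at startup, 100 seconds
ago, ranks well below a process that has just allocated the last 5MB, so
the newcomer is the one picked. The half-life is the knob that decides how
quickly old allocations stop counting.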

But I am a mere sysadmin, not a kernel hacker, so I'll leave that up to
the likes of you... ;)

Your comments are of course welcome; this is a learning process for me as
much as anything else.

-- 
---------------------------------------------------------------
Derek D. Martin              |  Unix/Linux Geek
ddm@MissionCriticalLinux.com |  derek@cerberus.ne.mediaone.net
---------------------------------------------------------------
