(2.2.15) Resource limit problems.

From: Scott A Crosby (crosby@qwes.math.cmu.edu)
Date: Fri May 26 2000 - 17:11:10 EST


Hello, I tend to run software systems like SML-NJ or CMUCL. (These are
software systems that use heap-style allocation, similar to Java.) They
can suddenly consume a large amount of memory without warning.

As Linux doesn't handle this well, I need resource limits that will
kill off these processes before they impact system stability.

I cannot do this.

For some reason, the DATA and RSS limits (ulimit -Sd and -Sm) will not
catch the runaway process.

Only the VM limit (ulimit -Sv) will catch a runaway SML-NJ (Standard ML of
New Jersey) process. Here's an example strace:

old_mmap(NULL, 48431104, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, 3,
                             0) = -1 ENOMEM (Cannot allocate memory)
/usr/local/sml: Error -- unable to map 48365568 bytes, errno = 12
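
To see the -Sv behaviour in isolation, here is a minimal sketch in Python 3
(not part of the tests reported here). It lowers the soft RLIMIT_AS of the
current process and then attempts a private anonymous mapping, roughly what
the failing old_mmap() above does. It assumes Linux with /proc mounted; the
helper names and the 64 MiB headroom are made up for illustration.

```python
import errno
import mmap
import resource

MIB = 1024 * 1024


def current_vm_bytes():
    """Total virtual size of this process (first field of /proc/self/statm)."""
    with open("/proc/self/statm") as f:
        pages = int(f.read().split()[0])
    return pages * resource.getpagesize()


def try_anonymous_map(size):
    """Return 0 if a private anonymous mapping of `size` bytes succeeds, else errno."""
    try:
        m = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE)
    except OSError as exc:
        return exc.errno
    m.close()
    return 0


def map_under_soft_as_limit(headroom, request):
    """Cap the soft RLIMIT_AS at (current size + headroom), try to map
    `request` bytes, then restore the previous soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    cap = current_vm_bytes() + headroom
    if soft != resource.RLIM_INFINITY:
        cap = min(cap, soft)  # never raise a limit that is already in force
    resource.setrlimit(resource.RLIMIT_AS, (cap, hard))
    try:
        return try_anonymous_map(request)
    finally:
        # A soft limit may be raised again, up to the hard limit.
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))


if __name__ == "__main__":
    for request in (1 * MIB, 256 * MIB):
        err = map_under_soft_as_limit(64 * MIB, request)
        print(request // MIB, "MiB:", "ok" if err == 0 else errno.errorcode[err])
```

On a kernel that enforces RLIMIT_AS, the small request should succeed and
the large one should fail with ENOMEM, the same errno (12) that SML-NJ
reports above.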

SML-NJ allocates memory as it needs it, occasionally doubling its memory
consumption when it's doing a garbage collection.

The other program, CMUCL (Carnegie Mellon University Common Lisp), mmaps
all of its heap before it uses it. When I run it under 'ulimit -Sv 40000':

old_mmap(0x10000000, 268431360, PROT_READ|PROT_WRITE|PROT_EXEC,
    MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM
         (Cannot allocate memory)

Its normal heap-allocation sequence is this:

old_mmap(0x10000000, 268431360, PROT_READ|PROT_WRITE|PROT_EXEC,
        MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x10000000
old_mmap(0x28000000, 268431360, PROT_READ|PROT_WRITE|PROT_EXEC,
        MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x28000000
old_mmap(0x48000000, 1879048192, PROT_READ|PROT_WRITE|PROT_EXEC,
        MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x48000000
old_mmap(0x38000000, 134213632, PROT_READ|PROT_WRITE|PROT_EXEC,
        MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x38000000
old_mmap(0x20000000, 134213632, PROT_READ|PROT_WRITE|PROT_EXEC,
        MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x20000000

As such, it obviously cannot be run under 'ulimit -Sv 40000', or any other
reasonable 'ulimit -Sv' limit.
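
A quick check of the arithmetic, as a sketch in Python 3: the five mappings
above add up to about 2.5 GiB of address space, and each one is by itself
larger than a 40000 KiB limit. This assumes 'ulimit -v' counts in 1024-byte
units, as bash documents it; the variable names are illustrative.

```python
KIB = 1024

# Sizes in bytes of the five mappings in the strace excerpt above.
cmucl_mappings = [268431360, 268431360, 1879048192, 134213632, 134213632]

# Assumption: the argument of `ulimit -Sv 40000` is in 1024-byte units.
limit_bytes = 40000 * KIB

total = sum(cmucl_mappings)
every_mapping_too_big = all(size > limit_bytes for size in cmucl_mappings)

print("total bytes:", total)            # 2684338176
print("total KiB:  ", total // KIB)     # 2621424
print("limit bytes:", limit_bytes)      # 40960000
print("each mapping exceeds limit:", every_mapping_too_big)  # True
```

So a -Sv limit would have to be above roughly 2621424 KiB before CMUCL
could even reserve its heap, which is far more than the memory it normally
touches.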

What I wish is to be able to limit the number of pages that a program can
use, not allocate, so that I can catch runaway SML-NJ or CMUCL processes,
without impacting their normal operation.
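
Short of kernel support, one user-space approximation of "limit pages used,
not allocated" is a watchdog that polls the resident set size and kills the
process when it grows too large. The following is only a sketch under stated
assumptions: Python 3, Linux with /proc mounted, and hypothetical helper
names. It is racy (a process can grow a lot between polls) and it counts
resident pages only, so pages already swapped out are not seen.

```python
import signal
import subprocess
import sys
import time


def resident_pages(pid):
    """Resident set size of `pid` in pages (second field of /proc/<pid>/statm)."""
    with open(f"/proc/{pid}/statm") as f:
        return int(f.read().split()[1])


def run_with_rss_limit(argv, max_pages, interval=1.0):
    """Run `argv` and SIGKILL it once its resident set exceeds `max_pages`.

    Returns the exit status as reported by subprocess: the exit code, or
    the negative signal number if the process was killed by a signal."""
    proc = subprocess.Popen(argv)
    while proc.poll() is None:
        try:
            if resident_pages(proc.pid) > max_pages:
                proc.kill()
                break
        except (FileNotFoundError, ProcessLookupError):
            break  # the process went away between poll() and the read
        time.sleep(interval)
    return proc.wait()


if __name__ == "__main__":
    # Illustrative run: a sleeping child with a limit of 0 pages is killed
    # on the first poll, so the status is the negative of SIGKILL.
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    status = run_with_rss_limit(sleeper, max_pages=0, interval=0.05)
    print("killed by watchdog:", status == -signal.SIGKILL)
```

For a real SML-NJ or CMUCL session, max_pages would be set to the largest
resident size considered acceptable, which leaves large untouched mappings
such as CMUCL's reserved heap alone.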

Grepping through my 2.2.15 source tree, RLIMIT_RSS seems to be used in
exactly one place, printing out ``/proc/this/stat''. RLIMIT_DATA seems
to be checked only in brk().

Any idea how to get resource limits that will work? (Please CC any
replies.)

Scott

--
Table of testing results. Here, I test each system in a
runaway-situation under various limits:

              SML           CMUCL
-Sv 50000     kills         won't start
-Sd 40000     won't kill    won't kill
-Sm 40000     won't kill    won't kill



