Re: Memory overcommitting (was Re: http://www.redhat.com/redhat/)

Theodore Y. Ts'o (tytso@MIT.EDU)
Thu, 20 Feb 1997 08:37:45 -0500


Date: Thu, 20 Feb 1997 11:26:42 +0200 (EET)
From: Tuukka Toivonen <tuukkat@stekt.oulu.fi>

As I understand it, Linux ALWAYS returns success from malloc(),
because the only reason malloc() would fail is running out of memory,
and that never happens in malloc().

So my question is: is there any point in checking whether malloc() returned
NULL (failure) or success? Should I just start using the memory without
checking whether the pointer is NULL?

It really is amazing how much mis-information has been generated on this
thread.....

malloc() does ***not*** always return success. The kernel makes a
heuristic check to see if there is enough memory for the malloc to
succeed _at_ _that_ _time_. Hence, it *is* important to check whether
malloc() returns NULL, because sometimes it will.

This check, however, is not (as someone has suggested) complete
protection against memory over-commit. The check is implemented in the
sys_brk() call, which is used by malloc(), and it checks whether there
are enough free pages, free swap pages, etc. to satisfy the increased
data segment size. However, it does not reserve this space right away,
so a program which calls sys_brk() several times can increase its data
segment to the point where there is not enough VM to satisfy the
process if all of those pages actually get used.

This check, then, catches memory allocation problems in well-behaved
programs ---- 99.97% of the time. Is it possible for a malicious
programmer to devise a way to bring the system to its knees? Yes. But
this is generally true; denial-of-service attacks are very hard to
defend against. In general, though, this check does keep a system
under normal operating conditions from actually running into trouble.

One other point which is important for people to understand: Linux is
unlike other Unix systems in that read-only text pages are not swapped
out to the swap partition. Instead, Linux will just reclaim the memory
by throwing a text page away, since it can always page it back in from
the executable. Hence, the lossage mode when a Linux system gets into
VM trouble is not that processes get killed (although that is a
possibility if things get desperate enough); instead, a Linux system
will tend to start thrashing very badly, since the text pages containing
the user executable code start getting thrown out in desperation as the
Linux VM system tries to find enough room for the data pages, which can
only live in memory or in swap space.

The aforementioned check in sys_brk() (and in mmap() for private
writable mappings) is good enough to catch normal VM over-loading. How
to handle extreme cases of overloading, normally caused by malicious
programs, is a completely different question. We could implement a
SIGDANGER-type approach as is used by AIX, but that won't save us from
a really malicious user anyway.

= Ted