Re: Memory intensive processes

Richard B. Johnson (root@analogic.com)
Wed, 11 Dec 1996 16:14:43 -0500 (EST)


On Wed, 11 Dec 1996, Systemkennung Linux wrote:

> Not a good idea. Implementing your suggestion would mean that large parts
> of the page tables which are just truncated off would have to be allocated
> in memory. Just mmaping 1gb from /dev/zero would cost you over 1mb of
> *not swappable* memory for page tables. Aside from that, allocation and
> filling consumes a lot of time.
>
> Better optimize the FFT ...
>
> Ralf
>
It is apparent that you didn't understand the well-proven concept. The
idea is that, unless actually written, the zero-filled memory doesn't have
to exist at all! This is called "demand-zero" paging and has been used
in VAXen forever.
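
As a rough illustration of what demand-zero looks like from user space, here
is a minimal Python sketch. It assumes an OS that backs anonymous mappings
lazily (Linux does); the function name demand_zero_demo and the 64 MiB size
are made up for the example. Whether the untouched pages really consume no
RAM is up to the kernel, and the script does not measure that.

```python
import mmap

SIZE = 64 * 1024 * 1024  # 64 MiB of anonymous address space (arbitrary)


def demand_zero_demo(size=SIZE):
    """Map anonymous memory, read it untouched, then dirty one page."""
    # fileno -1 asks for an anonymous mapping: the OS promises zero-filled
    # memory but, with demand-zero paging, need not back it with RAM yet.
    buf = mmap.mmap(-1, size)
    try:
        page = mmap.PAGESIZE
        # Reading pages that were never written yields zeros.
        first = buf[:page]
        last = buf[size - page:size]
        all_zero = first == bytes(page) and last == bytes(page)
        # The first write to a page is what forces real backing store.
        buf[0] = 0xFF
        # The written byte is visible; the neighbouring page is still zero.
        return all_zero, buf[0], buf[page]
    finally:
        buf.close()


if __name__ == "__main__":
    print(demand_zero_demo())
```

Only the page holding byte 0 is ever written, so on a demand-zero system
that is the only page that needs a private frame.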

In VAXen, a "page" was 512 bytes; on the i386++, a "page" is 4096 bytes.
That's 8 times more than in the olden days. However, RAM costs far less than
1/8th what it cost in the days when VAXen were designed. To satisfy EVERY
user's request for initialized RAM, regardless of how much of it is actually
used, e.g., 16 MB for an FFT buffer, requires only ONE page of zero-filled
RAM for the ENTIRE SYSTEM. A SINGLE page of RAM can look to ALL processes
like 300 megabytes of RAM as long as no process writes to it.
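
To put numbers on both sides of the argument, here is a small Python
calculation. It assumes a 4-byte page-table entry, as on a 32-bit i386
without PAE, and counts only leaf page-table entries (no page directory).
The helper names are invented for the example.

```python
MIB = 1024 * 1024
PAGE_I386 = 4096   # bytes per page on the i386
PAGE_VAX = 512     # bytes per page on the VAX
PTE_BYTES = 4      # assumption: 32-bit page-table entry, no PAE


def pages_needed(nbytes, page_size):
    """Number of pages required to cover nbytes (ceiling division)."""
    return -(-nbytes // page_size)


def page_table_bytes(nbytes, page_size=PAGE_I386, pte_bytes=PTE_BYTES):
    """Leaf page-table memory needed to map nbytes."""
    return pages_needed(nbytes, page_size) * pte_bytes


if __name__ == "__main__":
    print(pages_needed(16 * MIB, PAGE_I386))   # 4096 pages for a 16 MiB buffer
    print(pages_needed(16 * MIB, PAGE_VAX))    # 32768 pages on a VAX
    print(page_table_bytes(1024 * MIB))        # 1048576 bytes to map 1 GiB
```

So mapping 1 GiB does cost about 1 MiB of page-table entries, which is
Ralf's figure, while all of those entries can point at one and the same
zero-filled page until something is written.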

When an actual write occurs, the page that was filled in from "demand-zero"
is replaced with a page that can be dirtied. In fact, on the VAX, the
write is allowed to occur and then that page is removed from the demand-zero
list and becomes part of the process's working set.
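
The bookkeeping can be sketched with a toy model in Python. This is not
kernel code, and ToyAddressSpace and its methods are hypothetical names.
Unlike the VAX behaviour described above, this model swaps in the private
page before performing the write, which is simpler to show.

```python
PAGE_SIZE = 4096
ZERO_FRAME = bytes(PAGE_SIZE)  # the single shared, read-only zero page


class ToyAddressSpace:
    """Toy per-process page table: virtual page number -> frame."""

    def __init__(self, npages):
        # Every virtual page starts out pointing at the shared zero frame.
        self.frames = [ZERO_FRAME] * npages

    def read(self, addr):
        page, off = divmod(addr, PAGE_SIZE)
        return self.frames[page][off]

    def write(self, addr, value):
        page, off = divmod(addr, PAGE_SIZE)
        if self.frames[page] is ZERO_FRAME:
            # "Page fault": give this page its own writable, zeroed frame.
            self.frames[page] = bytearray(PAGE_SIZE)
        self.frames[page][off] = value

    def private_pages(self):
        """Pages that now need real RAM of their own."""
        return sum(1 for f in self.frames if f is not ZERO_FRAME)


if __name__ == "__main__":
    space = ToyAddressSpace(4096)      # a 16 MiB "buffer"
    print(space.private_pages())       # nothing written yet
    space.write(12345, 7)
    print(space.read(12345), space.private_pages())
```

A 16 MiB buffer that has had one byte written costs one private page in
this model; the other 4095 entries still refer to the shared zero frame.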

Whether or not you can "improve" an FFT is immaterial. At some time during
the program's operation, some kind of buffer allocation is made, even if
declared as a local buffer on the stack. At that instant in time, the
buffer has not been written at all. At that instant in time, a real
buffer containing writable RAM does not need to exist.

Further, for security reasons, VAXen do not allow a user to read OLD data
that might have been written by some other task at some time. Therefore,
even if you allocate a buffer locally on the stack, the buffer is zeroed.

Note that I don't advocate copying Digital's mistakes. However, we can
learn a lot about performance by understanding how these things work. One
of the problems with "Digital security" is that if you extend a file, the
extension is written with zeros before you are allowed to access that file.
That takes a lot of CPU cycles and hurts performance.

If you look at things in small pieces, you will note that major portions
of a process's RAM only have to exist for a short while. A lot of RAM
in the TEXT segment, i.e., program code, is only accessed ONCE upon
startup and never read again. This represents valuable RAM that could be
used for other processes. The kernel knows where every process's program
counter is. If it exceeds an offset of, say, 0x2000, the kernel knows damn
well that the first 0x1000 can be written to a swap device, freeing a whole
page. On VAXen, it knows that it doesn't even have to be written to the swap
device because it can always re-read the program file! In the unlikely
event that a program jumps back to its very beginning, that process
suffers the consequences of the kernel having to get that page from disk.
The other processes do not.
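
The point about not needing swap for program text can be shown with another
toy model in Python. ToyTextSegment is a hypothetical name, and a bytes
object stands in for the program file on disk. A clean, file-backed page is
simply dropped and fetched again on the next access.

```python
PAGE_SIZE = 4096


class ToyTextSegment:
    """Toy model of read-only program text backed by its own file."""

    def __init__(self, image):
        self.image = image       # stands in for the program file on disk
        self.resident = {}       # page number -> bytes currently "in RAM"
        self.disk_reads = 0      # how often a page had to be fetched

    def fetch(self, page_no):
        if page_no not in self.resident:
            # Page fault: re-read the page from the program file.
            start = page_no * PAGE_SIZE
            self.resident[page_no] = self.image[start:start + PAGE_SIZE]
            self.disk_reads += 1
        return self.resident[page_no]

    def evict(self, page_no):
        # The page is clean and file-backed, so there is nothing to write
        # to swap. Forgetting it is enough to free the RAM.
        self.resident.pop(page_no, None)


if __name__ == "__main__":
    seg = ToyTextSegment(bytes(range(256)) * 64)   # a 4-page "program"
    seg.fetch(0)          # startup code runs once
    seg.fetch(2)          # execution has moved on
    seg.evict(0)          # the startup page can go
    print(sorted(seg.resident), seg.disk_reads)
```

Only a process that jumps back to page 0 pays for the extra read; the evict
step itself costs no I/O at all.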

Remember that the kernel can make any page of physical RAM exist anywhere
within a process's address space. The only requirement is that it can't
deal with less than a "page". The present kernel does this all the time.

Cheers,
Dick Johnson
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Richard B. Johnson
Project Engineer
Analogic Corporation
Voice : (508) 977-3000 ext. 3754
Fax : (508) 532-6097
Modem : (508) 977-6870
Ftp : ftp@boneserver.analogic.com
Email : rjohnson@analogic.com, johnson@analogic.com
Penguin : Linux version 2.1.14 on an i586 machine.
Warning : It's hard to remain at the trailing edge of technology.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-