Re: Page cache patch - 2'nd version

Dr. Werner Fink (werner@suse.de)
Tue, 1 Jul 1997 13:30:57 +0200


> Ekkk! No!
>
> This changes the ageing of named (mapped) pages to be very aggressive,
> which will result in many more page faults and file-system read/writes.
> The faults may be hard (that is all mappings will have been undone,
> and the page removed from the cache, so it will have to be re-read from
> store), or soft (that is the page has survived reaping, and is still in
> the page-cache, but this is still expensive).
> The only guard left against reaping named (mapped) pages will be the
> 'young' attribute of the PTE.
>
> Yes, it will mean that named (mapped) pages can be reaped easier, which
> for some usage patterns will be a win (mainly those pages with a short
> time in a working set. A reasonable example is compiling a kernel).
> And it will save CPU cycles in swap_out() - fewer dead-end searches.
>
> However, if you compile a kernel on a box where users are (say) using
> bash/vi, they will quickly find their working set of named pages reduced
> (and quickly reduced again, if it does happen to grow). This is not very
> friendly.

Have you run any tests? I am not able to reproduce what you describe.
With Krzysztof Strasburger's change I see much less I/O due to swapping
under high load, and the system is much more responsive.

>
> If you want pages to be reaped easier, use the page-ageing tunables!

Maybe it would be better to use age_page() to avoid aggressive aging and
get a better compromise between a responsive system and keeping the cache
pages that are currently in use.
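
To make the idea concrete, here is a small user-space sketch of the
difference between aging a page gradually and making it reapable in one
step. It is only a model, not kernel code and not the patch under
discussion: struct page_model, the *_model helpers, scans_until_reapable()
and the decrement MODEL_PAGE_DECLINE = 1 are all assumptions made for
illustration.

```c
/* User-space model only; nothing here is taken from the kernel source. */
struct page_model {
    unsigned int age;   /* 0 means the page is a candidate for reaping */
};

enum { MODEL_PAGE_DECLINE = 1 };   /* assumed per-scan decrement */

/* Gradual aging: lower the age by a fixed step, clamped at zero. */
static void age_page_model(struct page_model *page)
{
    if (page->age > MODEL_PAGE_DECLINE)
        page->age -= MODEL_PAGE_DECLINE;
    else
        page->age = 0;
}

/* Stand-in for "aggressive" aging: the stored age is ignored. */
static void zap_age_model(struct page_model *page)
{
    page->age = 0;
}

/* Count how many scans an unreferenced page survives before age == 0. */
static unsigned int scans_until_reapable(unsigned int initial_age, int gradual)
{
    struct page_model page = { initial_age };
    unsigned int scans = 0;

    while (page.age > 0) {
        if (gradual)
            age_page_model(&page);
        else
            zap_age_model(&page);
        scans++;
    }
    return scans;
}
```

With these assumed numbers, a page that starts at age 16 survives 16 scans
under gradual aging but only one scan under the aggressive stand-in, which
is the kind of compromise I have in mind.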

>
> NOTE: Named pages are given an initial age of PAGE_AGE_VALUE (see
> pagemap.h - this is not a soft-tunable), which is 16. The configured
> default age for anonymous (and shm()) pages is PAGE_INITIAL_AGE, which is
> 20.

After re-reading include/linux/swapctl.h and mm/swap.h I found that
PAGE_INITIAL_AGE (swap_control.sc_page_initial_age) is set to 3 and
MAX_PAGE_AGE (swap_control.sc_max_page_age) is set to 20.
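
As a rough illustration of what these two values imply, the following
user-space sketch models a page that is referenced repeatedly and counts
how many references it takes to climb from the initial age to the maximum.
The two ages are the ones quoted above; the advance step of 3, the struct
swap_control_model layout and the helper names are assumptions for this
example only.

```c
/* User-space model; only the two ages come from the values quoted above. */
struct swap_control_model {
    unsigned int sc_max_page_age;      /* 20, as found in the sources */
    unsigned int sc_page_initial_age;  /* 3, as found in the sources  */
    unsigned int sc_page_advance;      /* assumed step per reference  */
};

static const struct swap_control_model model_ctl = { 20, 3, 3 };

/* One reference: raise the age by the advance step, clamped at the max. */
static unsigned int touch_age_model(unsigned int age)
{
    if (age + model_ctl.sc_page_advance < model_ctl.sc_max_page_age)
        return age + model_ctl.sc_page_advance;
    return model_ctl.sc_max_page_age;
}

/* References needed to go from the initial age to the maximum age. */
static unsigned int touches_to_max(void)
{
    unsigned int age = model_ctl.sc_page_initial_age;
    unsigned int touches = 0;

    while (age < model_ctl.sc_max_page_age) {
        age = touch_age_model(age);
        touches++;
    }
    return touches;
}
```

Under these assumptions a fresh page needs six references to reach the
maximum age, so an initial age of 3 leaves new pages much closer to being
reaped than an initial age of 16 or 20 would.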

>
> If swap is configured as a device, as opposed to a file, it is faster to
> read/write an anonymous-page to/from swap than a named-page to/from the
> file-system (there is less code overhead). So there should be a slight
> preference to reap anonymous pages (they should be given a smaller initial
> page age), but in the current implementation this is not true.
> Why? The swap-device does not perform readahead. To do so, 'related'
> anonymous pages (that is pages from the same vm_area) would need to be
> written down in clusters - treated as a large page _when_ it makes sense.
>
> Memory management does need to be made more flexible in Linux. There
> are already a few patches around. Unfortunately, to go much further MM
> needs a re-work.

A tested solution is needed for a working 2.0.31 :-)
It should be possible to use the system during and after piping a big file
to /dev/null, running `make -j' in the kernel tree, or running some other
stress tests.

Werner