Re: [PATCH] RFC: vmscan: add min_filelist_kbytes sysctl for protecting the working set

From: Andrew Morton
Date: Thu Oct 28 2010 - 16:11:51 EST


On Thu, 28 Oct 2010 12:15:23 -0700
Mandeep Singh Baines <msb@xxxxxxxxxxxx> wrote:

> On ChromiumOS, we do not use swap.

Well that's bad. Why not?

> When memory is low, the only way to
> free memory is to reclaim pages from the file list. This results in a
> lot of thrashing under low memory conditions. We see the system become
> unresponsive for minutes before it eventually OOMs. We also see very
> slow browser tab switching under low memory. Instead of an unresponsive
> system, we'd really like the kernel to OOM as soon as it starts to
> thrash. If it can't keep the working set in memory, then OOM.
> Losing one of many tabs is a better behaviour for the user than an
> unresponsive system.
>
> This patch creates a new sysctl, min_filelist_kbytes, which disables reclaim
> of file-backed pages when there are less than min_filelist_kbytes worth
> of such pages in the cache. This tunable is handy for low memory systems
> using solid-state storage where interactive response is more important
> than not OOMing.
>
> With this patch and min_filelist_kbytes set to 50000, I see very little
> block layer activity during low memory. The system stays responsive under
> low memory and browser tab switching is fast. Eventually, a process gets
> killed by OOM. Without this patch, the system gets wedged for minutes
> before it eventually OOMs. Below is the vmstat output from my test runs.
>
> BEFORE (notice the high bi and wa, also how long it takes to OOM):

That's an interesting result.

Having the machine "wedged for minutes" thrashing away paging
executable text is pretty bad behaviour. I wonder how to fix it.
Perhaps simply declaring oom at an earlier stage.

Your patch is certainly simple enough but a bit sad. It says "the VM
gets this wrong, so let's just disable it all". And thereby reduces the
motivation to fix it for real.

But the patch definitely improves matters in real-world
situations and there's a case to be made that it should be available at
least as an interim thing until the VM gets fixed for real. Which
means that the /proc tunable might disappear again (or become a no-op)
some time in the future.
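
For readers following along, the threshold check described in the quoted
patch text can be sketched in user-space C roughly as below. This is not
the code from the patch: the helper names, the parameters and the
kilobytes-to-pages conversion are assumptions made for illustration, and
only the sysctl name min_filelist_kbytes comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not from the patch): convert a threshold given in
 * kilobytes into a page count for a given page size in bytes.
 */
static unsigned long kbytes_to_pages(unsigned long kbytes,
                                     unsigned long page_size_bytes)
{
    /* Assumes the page size is a whole number of kilobytes. */
    assert(page_size_bytes >= 1024 && page_size_bytes % 1024 == 0);
    return kbytes / (page_size_bytes / 1024);
}

/*
 * Hypothetical predicate (not from the patch): returns true when the
 * number of file-backed pages in the cache has dropped below the
 * min_filelist_kbytes threshold, i.e. when reclaim of file-backed pages
 * would be disabled. A threshold of 0 never disables reclaim.
 */
static bool file_reclaim_disabled(unsigned long nr_file_pages,
                                  unsigned long min_filelist_kbytes,
                                  unsigned long page_size_bytes)
{
    unsigned long min_pages =
        kbytes_to_pages(min_filelist_kbytes, page_size_bytes);

    return nr_file_pages < min_pages;
}
```

Assuming 4096-byte pages, the value of 50000 used in the test runs
corresponds to 12500 pages, so in this sketch file-backed reclaim would
stop once fewer than 12500 such pages remain in the cache.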

