Re: [RFC][PATCH] avoid swapping out with swappiness==0

From: KOSAKI Motohiro
Date: Wed Mar 07 2012 - 12:19:17 EST


On 3/5/2012 4:56 PM, Johannes Weiner wrote:
> On Fri, Mar 02, 2012 at 12:36:40PM -0500, Satoru Moriya wrote:
>> Sometimes we'd like to avoid swapping out anonymous memory; in
>> particular, we'd like to avoid swapping out pages of important processes
>> or process groups while there is a reasonable amount of pagecache in
>> RAM, so that we can satisfy our customers' requirements.
>>
>> OTOH, we can control how aggressively the kernel swaps memory pages
>> with /proc/sys/vm/swappiness for global and
>> /sys/fs/cgroup/memory/memory.swappiness for each memcg.
>>
>> But with the current reclaim implementation, the kernel may swap out
>> even if we set swappiness==0 and there is pagecache in RAM.
>>
>> This patch changes the behavior with swappiness==0. If we set
>> swappiness==0, the kernel does not swap out at all (for global
>> reclaim, until the amount of free pages and file-backed pages in a
>> zone has been reduced to something very small:
>> nr_free + nr_filebacked < high watermark).
>>
>> Any comments are welcome.
>
> Last time I tried that (getting rid of sc->may_swap, using
> !swappiness), it was rejected as there were users who relied on
> swapping very slowly with this setting.
>
> KOSAKI-san, do I remember correctly? Do you still think it's an
> issue?
>
> Personally, I still think it's illogical that !swappiness allows
> swapping and would love to see this patch go in.

Thank you, now I remember it. Unfortunately, DB folks are still mainly
using RHEL5-generation distros. In those kernels, swappiness=0 doesn't
mean that swap is disabled.

What they want is "don't swap as long as the kernel has any file cache
pages", but Linux doesn't have such a feature, so they use swappiness to
emulate it. So I think this patch clearly harms userland, because we
don't have an alternative way to get that behavior.
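
For reference, the global-reclaim condition from the quoted patch
description (nr_free + nr_filebacked < high watermark) can be sketched
roughly as below. This is only an illustration of the stated rule, not
the patch itself: struct zone_snapshot, its field names, and
may_scan_anon() are hypothetical names made up for this example and are
not kernel APIs. The swappiness != 0 case is deliberately not modeled.

```c
#include <stdbool.h>

/* Hypothetical per-zone page counts; names are illustrative only. */
struct zone_snapshot {
	unsigned long nr_free;       /* free pages in the zone */
	unsigned long nr_filebacked; /* file-backed (pagecache) pages */
	unsigned long high_wmark;    /* the zone's high watermark, in pages */
};

/*
 * Sketch of the rule in the patch description for global reclaim:
 * with swappiness == 0, anonymous pages become eligible for swap-out
 * only once free + file-backed pages drop below the high watermark.
 * For any other swappiness value this sketch simply returns true; the
 * real anon/file balancing is out of scope here.
 */
static bool may_scan_anon(int swappiness, const struct zone_snapshot *z)
{
	if (swappiness != 0)
		return true;

	return z->nr_free + z->nr_filebacked < z->high_wmark;
}
```

Under this rule, a zone that still holds plenty of pagecache is never
swapped from when swappiness==0, which is different from the "swap very
slowly" behavior that existing users rely on.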
