Re: [PATCH] mm/vmscan: add vm_swappiness configuration knobs

From: Michal Hocko
Date: Thu Mar 12 2020 - 09:26:54 EST


On Thu 12-03-20 12:54:19, Ivan Teterevkov wrote:
> On Thurs, 12 Mar 2020, Michal Hocko wrote:
>
> > On Wed 11-03-20 17:45:58, Ivan Teterevkov wrote:
> > > This patch adds a couple of knobs:
> > >
> > > - The configuration option (CONFIG_VM_SWAPPINESS).
> > > - The command line parameter (vm_swappiness).
> > >
> > > The default value is preserved, but now defined by CONFIG_VM_SWAPPINESS.
> > >
> > > Historically, the default swappiness is set to the well-known value
> > > 60, and this works well for the majority of cases. The vm_swappiness
> > > is also exposed as the kernel parameter that can be changed at runtime
> > > too, e.g. with sysctl.
> > >
> > > This approach might not suit some configurations well, e.g.
> > > systemd-based distros, where systemd is put in charge of the cgroup
> > > controllers, including the memory one. In such cases, the default
> > > swappiness 60 is copied across the cgroup subtrees early at startup,
> > > when systemd is arranging the slices for its services, before the
> > > sysctl.conf or tmpfiles.d/*.conf changes are applied.
> > >
> > > One could run a script to traverse the cgroup trees later and set the
> > > desired memory.swappiness individually in each occurrence when the
> > > runtime is set up, but this would require some amount of work to
> > > implement properly. Instead, why not set the default swappiness as
> > > early as possible?
> >
> > I have to say I am not a great fan of more tuning for swappiness, as it has
> > been quite a poor tunable for many years already. It essentially does nothing
> > in many cases because the reclaim process ignores the value (have a look at
> > get_scan_count). I have seen quite a few reports that setting a specific value
> > for vm_swappiness didn't make any difference. The knob itself has terrible
> > semantics to begin with because there is no way to express "I really prefer
> > to swap rather than reclaim page cache".
> >
> > This all makes me think that swappiness is a historical mistake that we should
> > rather make obsolete than promote even further.
>
> Absolutely agree, the semantics of vm_swappiness are perplexing.
> Moreover, the same get_scan_count treats vm_swappiness and the cgroup
> memory.swappiness differently; in particular, 0 disables swap for the memcg.
>
> Certainly, the patch adds some additional exposure to a parameter that
> is not trivial to tackle but it's already getting created with a magic
> number which is also confusing. Is there any harm to be done by the patch
> considering the already existing sysctl interface to that knob?

Like any other config option or kernel parameter, it adds to the
overall config space size problem, and unless this is really needed I
would rather not make it worse.
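
For reference, the userspace alternative mentioned in the quoted patch
description (walking the cgroup tree after boot and setting
memory.swappiness in each group) could look roughly like the sketch below.
It is only an illustration under a few assumptions: the helper name
set_swappiness is made up, the memory controller is assumed to be a cgroup
v1 hierarchy, and the mount point used in the usage note is the
conventional one rather than something stated in this thread.

```python
import os


def set_swappiness(root, value):
    """Write `value` to every memory.swappiness file found under `root`.

    Hypothetical helper for illustration only. `root` is assumed to be the
    mount point of a cgroup v1 memory controller hierarchy.
    Returns the list of files that were updated.
    """
    updated = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "memory.swappiness" not in filenames:
            continue
        path = os.path.join(dirpath, "memory.swappiness")
        try:
            with open(path, "w") as f:
                f.write(f"{value}\n")
        except OSError:
            # The group may have been removed in the meantime, or the
            # kernel may have rejected the value; skip it and carry on.
            continue
        updated.append(path)
    return updated
```

Run as root once the services are up, e.g.
set_swappiness("/sys/fs/cgroup/memory", 10). Groups created after the walk
are not covered, which is part of why doing this properly takes more work.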
--
Michal Hocko
SUSE Labs