Re: [PATCH RFC tip/core/rcu] Add shrinker to shift to fast/inefficient GP mode

From: Johannes Weiner
Date: Thu May 07 2020 - 13:00:26 EST


On Wed, May 06, 2020 at 05:55:35PM -0700, Andrew Morton wrote:
> On Wed, 6 May 2020 17:42:40 -0700 "Paul E. McKenney" <paulmck@xxxxxxxxxx> wrote:
>
> > This commit adds a shrinker so as to inform RCU when memory is scarce.
> > RCU responds by shifting into the same fast and inefficient mode that is
> > used in the presence of excessive numbers of RCU callbacks. RCU remains
> > in this state for one-tenth of a second, though this time window can be
> > extended by another call to the shrinker.

We may be able to use shrinkers here, but merely being invoked does
not carry a reliable distress signal.

Shrinkers get invoked whenever vmscan runs. It's a useful indicator
for when to age an auxiliary LRU list - test references, clear and
rotate or reclaim stale entries. The urgency, and what can and cannot
be considered "stale", is encoded in the callback frequency and scan
counts, and meant to be relative to the VM's own rate of aging: "I've
tested X percent of mine for recent use, now you go and test the same
share of your pool." It doesn't translate well to other
interpretations of the callbacks, although people have tried.
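
To make the "same share" idea concrete, here is a rough userspace
sketch. It is not the kernel's actual scan arithmetic, and
proportional_scan() is a made-up name for illustration only: if the
VM tested vm_scanned out of vm_total of its own pages, the auxiliary
pool is asked to test the same fraction of its objects.

```c
#include <assert.h>

/*
 * Userspace illustration only, not kernel code. proportional_scan() is a
 * hypothetical helper: if the VM tested vm_scanned out of vm_total of its
 * own pages, ask the auxiliary pool to test the same share of its objects.
 */
static unsigned long proportional_scan(unsigned long vm_scanned,
				       unsigned long vm_total,
				       unsigned long pool_size)
{
	assert(vm_total > 0 && vm_scanned <= vm_total);
	/* Widen before multiplying so the product cannot overflow. */
	return (unsigned long)((unsigned long long)pool_size *
			       vm_scanned / vm_total);
}
```

The result only says how far to advance the aging of the pool. It
says nothing about whether the system is in trouble, which is why it
makes a poor distress signal.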

> > If it proves feasible, a later commit might add a function call directly
> > indicating the end of the period of scarce memory.
>
> (Cc David Chinner, who often has opinions on shrinkers ;))
>
> It's a bit abusive of the intent of the slab shrinkers, but I don't
> immediately see a problem with it. Always returning 0 from
> ->scan_objects might cause a problem in some situations(?).
>
> Perhaps we should have a formal "system getting low on memory, please
> do something" notification API.

It's tricky to find a useful definition of what low on memory
means. In the past we've used sc->priority cutoffs, the vmpressure
interface (reclaimed/scanned, i.e. reclaim efficiency cutoffs), and
oom notifiers (another reclaim efficiency cutoff). But none of these
reliably capture "distress", and they vary widely between different
hardware setups. It can be hard to trigger OOM itself on fast IO
devices, even when the machine is way past useful (where useful is
somewhat subjective to the user). Userspace OOM implementations that
consider userspace health (also subjective) are getting more common.
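
As a rough illustration of what a reclaim efficiency cutoff looks
like, here is a simplified userspace sketch. It is not the exact
vmpressure arithmetic; reclaim_pressure(), looks_distressed() and the
cutoff parameter are hypothetical names chosen for this example.

```c
/*
 * Userspace illustration only. Returns the percentage of scanned pages
 * that could NOT be reclaimed: 0 means reclaim is fully efficient,
 * 100 means nothing that was scanned could be freed.
 */
static unsigned int reclaim_pressure(unsigned long scanned,
				     unsigned long reclaimed)
{
	if (scanned == 0 || reclaimed >= scanned)
		return 0;
	return (unsigned int)(100 - (unsigned long long)reclaimed * 100 /
			      scanned);
}

/* Hypothetical cutoff check; the right cutoff is the hard part. */
static int looks_distressed(unsigned long scanned, unsigned long reclaimed,
			    unsigned int cutoff)
{
	return reclaim_pressure(scanned, reclaimed) >= cutoff;
}
```

The ratio is easy to compute. The problem is picking the cutoff: the
same value can mean very different things on different hardware
setups.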

> How significant is this? How much memory can RCU consume?

I think if RCU can end up consuming a significant share of memory, one
way that may work would be to do proper shrinker integration and track
the age of its objects relative to the age of other allocations in the
system. I.e. toss them all on a clock list with "new" bits and shrink
them at VM velocity. If the shrinker sees objects with the new bit
set, clear and rotate. If it sees objects without it, we know
rcu_heads outlive cache pages etc. and should probably cycle faster
too.
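
A minimal userspace sketch of that clock list, under assumptions of
my own: struct clock_obj, clock_add() and clock_scan() are
hypothetical names, a real version would embed the bit with the
rcu_head and take the objects off the list when their callbacks are
invoked, and locking is ignored entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace illustration only, not kernel code. */
struct clock_obj {
	struct clock_obj *next;
	bool is_new;		/* set on insertion, cleared on first scan */
};

struct clock_list {
	struct clock_obj *head;	/* next object the scan will look at */
	struct clock_obj *tail;
};

static void clock_push_tail(struct clock_list *l, struct clock_obj *o)
{
	o->next = NULL;
	if (l->tail)
		l->tail->next = o;
	else
		l->head = o;
	l->tail = o;
}

/* Queue a freshly created object with its "new" bit set. */
static void clock_add(struct clock_list *l, struct clock_obj *o)
{
	o->is_new = true;
	clock_push_tail(l, o);
}

/*
 * Scan up to nr_to_scan objects from the head and rotate each one to
 * the tail. Objects with the new bit set just get the bit cleared.
 * Objects with the bit already clear have survived a whole rotation
 * at VM aging speed and are counted as stale.
 */
static size_t clock_scan(struct clock_list *l, size_t nr_to_scan)
{
	size_t stale = 0;

	for (; nr_to_scan > 0 && l->head; nr_to_scan--) {
		struct clock_obj *o = l->head;

		/* Unlink from the head. */
		l->head = o->next;
		if (!l->head)
			l->tail = NULL;

		if (o->is_new)
			o->is_new = false;
		else
			stale++;

		clock_push_tail(l, o);
	}
	return stale;
}
```

A nonzero return from clock_scan() would be the hint that these
objects are outliving cache pages and that grace periods should
probably cycle faster.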