Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.

From: Michal Hocko
Date: Fri Jun 29 2018 - 05:04:29 EST


On Thu 28-06-18 14:31:05, Paul E. McKenney wrote:
> On Thu, Jun 28, 2018 at 01:39:42PM +0200, Michal Hocko wrote:
> > On Wed 27-06-18 07:31:25, Paul E. McKenney wrote:
> > > On Wed, Jun 27, 2018 at 09:22:07AM +0200, Michal Hocko wrote:
> > > > On Tue 26-06-18 10:03:45, Paul E. McKenney wrote:
> > > > [...]
> > > > > 3. Something else?
> > > >
> > > > How hard would it be to use a different API than oom notifiers? E.g. a
> > > > shrinker which just kicks all the pending callbacks if the reclaim
> > > > priority reaches low values (e.g. 0)?
> > >
> > > Beats me. What is a shrinker? ;-)
> >
> > This is a generic mechanism to reclaim memory that is not on standard
> > LRU lists. LWN.net surely has some nice coverage (e.g.
> > https://lwn.net/Articles/548092/).
>
> "In addition, there is little agreement over what a call to a shrinker
> really means or how the called subsystem should respond." ;-)
>
> Is this set up using register_shrinker() in mm/vmscan.c? I am guessing

Yes, exactly. You are supposed to implement the two methods
(->count_objects() and ->scan_objects()) in struct shrinker.
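
To make the count/scan contract concrete, here is a rough userspace model
of it. This is only a sketch: every demo_* name is made up for
illustration, the types merely mimic the shape of the kernel's callback
pair, and the real reclaim code in mm/vmscan.c does considerably more
(batching, priorities, deferred work).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-call reclaim arguments. */
struct demo_shrink_control {
	unsigned long nr_to_scan;	/* objects the caller asks us to scan */
};

/* Hypothetical stand-in for the two-callback shrinker interface. */
struct demo_shrinker {
	unsigned long (*count_objects)(struct demo_shrinker *s,
				       struct demo_shrink_control *sc);
	unsigned long (*scan_objects)(struct demo_shrinker *s,
				      struct demo_shrink_control *sc);
};

/* Pretend cache: just a counter of freeable objects. */
static unsigned long demo_cache_objects = 10;

/* Report how many objects could be freed right now (may be approximate). */
static unsigned long demo_count(struct demo_shrinker *s,
				struct demo_shrink_control *sc)
{
	(void)s;
	(void)sc;
	return demo_cache_objects;
}

/* Free up to sc->nr_to_scan objects and report how many were freed. */
static unsigned long demo_scan(struct demo_shrinker *s,
			       struct demo_shrink_control *sc)
{
	unsigned long freed = sc->nr_to_scan < demo_cache_objects ?
			      sc->nr_to_scan : demo_cache_objects;

	(void)s;
	demo_cache_objects -= freed;
	return freed;
}

static struct demo_shrinker demo_shrinker = {
	.count_objects = demo_count,
	.scan_objects = demo_scan,
};

/* One simplified reclaim pass: count first, then scan at most 'batch'. */
static unsigned long demo_reclaim_pass(struct demo_shrinker *s,
				       unsigned long batch)
{
	struct demo_shrink_control sc = { .nr_to_scan = 0 };
	unsigned long freeable;

	assert(s && s->count_objects && s->scan_objects);

	freeable = s->count_objects(s, &sc);
	if (freeable == 0)
		return 0;	/* nothing to do, skip the scan callback */

	sc.nr_to_scan = batch < freeable ? batch : freeable;
	return s->scan_objects(s, &sc);
}
```

The point of the split is that counting is supposed to be cheap and may
be a guess, while scanning does the actual freeing and reports back what
it managed to release.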

> that the many mentions of shrinker in DRM are irrelevant.
>
> If my guess is correct, the API seems a poor fit for RCU. I can
> produce an approximate number of RCU callbacks for ->count_objects(),
> but a given callback might free a lot of memory or none at all. Plus,
> to actually have ->scan_objects() free them before returning, I would
> need to use something like rcu_barrier(), which might involve longer
> delays than desired.

Well, I am not yet sure how good a fit this is because I still do not
understand the underlying problem your notifier is trying to solve. So I
will get back to this once that is settled.
>
> Or am I missing something here?
>
> > > More seriously, could you please point me at an exemplary shrinker
> > > use case so I can see what is involved?
> >
> > Well, I am not really sure what the objective of the oom notifier is, so
> > it is hard to point you in the right direction. IIUC you just want to kick callbacks
> > to be handled sooner under a heavy memory pressure, right? How is that
> > achieved? Kick a worker?
>
> That is achieved by enqueuing a non-lazy callback on each CPU's callback
> list, but only for those CPUs having non-empty lists. This causes
> CPUs with lists containing only lazy callbacks to be more aggressive,
> in particular, it prevents such CPUs from hanging out idle for seconds
> at a time while they have callbacks on their lists.
>
> The enqueuing happens via an IPI to the CPU in question.

I am afraid this is too low level for me to understand what is going on
here. What are lazy callbacks and why do they need any specific action
when we are getting close to OOM? I mean, I do understand that we might
have many callers of call_rcu and free memory lazily. But there is quite
a long way from the start of reclaim until we reach the OOM killer path.
So why don't those callbacks get called during that time period? How are
they triggered when we are not hitting the OOM path? They surely cannot
sit there forever, right? Can we trigger them sooner? Maybe the
shrinker is not the best fit but we have a retry feedback loop in the page
allocator, maybe we can kick this processing from there.
--
Michal Hocko
SUSE Labs