Re: [patch] mm, oom: prevent soft lockup on memcg oom for UP systems

From: Michal Hocko
Date: Thu Mar 12 2020 - 16:16:29 EST


On Thu 12-03-20 11:20:33, David Rientjes wrote:
> On Thu, 12 Mar 2020, Michal Hocko wrote:
>
> > > I think the changelog clearly states that we need to guarantee that a
> > > reclaimer will yield the processor back to allow a victim to exit. This
> > > is where we make the guarantee. If it helps for the specific reason it
> > > triggered in my testing, we could add:
> > >
> > > "For example, mem_cgroup_protected() can prohibit reclaim and thus any
> > > yielding in page reclaim would not address the issue."
> >
> > I would suggest something like the following:
> > "
> > The reclaim path (including the OOM) relies on explicit scheduling
> > points to hand over execution to tasks which could help with the reclaim
> > process.
>
> Are there other examples where yielding in the reclaim path would "help
> with the reclaim process" other than oom victims? This sentence seems
> vague.

In the context of UP and !PREEMPT this also includes IO flushers,
filesystems that rely on workers, and most likely other things I am not
aware of. If you think this is vague then feel free to reformulate.
All I really do care about is what the next paragraph is explaining.

> > Currently it is mostly shrink_page_list which yields CPU for
> > each reclaimed page. This might be insufficient though in some
> > configurations. E.g. when a memcg OOM path is triggered in a hierarchy
> > which doesn't have any reclaimable memory because of memory reclaim
> > protection (MEMCG_PROT_MIN) then it is possible to trigger a soft
> > lockup during an out of memory situation on non-preemptible kernels
> > <PUT YOUR SOFT LOCKUP SPLAT HERE>
> >
> > Fix this by adding a cond_resched() higher up in the reclaim path to
> > make sure there is a yield point regardless of reclaimability of the
> > target hierarchy.
> > "
> >
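
To illustrate the difference the last paragraph describes, here is a
rough userspace sketch. It is not kernel code: yield_point() is a
made-up stand-in for cond_resched() that only counts calls, and both
function names are hypothetical. It only shows that a per-page yield
never fires when nothing is reclaimable, while a yield at the top of
the pass always does.

```c
/* Hypothetical stand-in for cond_resched(): it only counts calls. */
static int yield_count;

static void yield_point(void)
{
	yield_count++;
}

/*
 * Yields once per reclaimed page, roughly like the per-page yield
 * mentioned above. With zero reclaimable pages (e.g. everything is
 * protected) the loop body never runs, so there is no yield at all.
 */
static int reclaim_per_page_yield(int reclaimable_pages)
{
	int reclaimed = 0;

	for (int i = 0; i < reclaimable_pages; i++) {
		reclaimed++;
		yield_point();
	}
	return reclaimed;
}

/*
 * Same pass, but with one unconditional yield before looking at any
 * page, so there is a yield point regardless of reclaimability.
 */
static int reclaim_with_top_level_yield(int reclaimable_pages)
{
	yield_point();
	return reclaim_per_page_yield(reclaimable_pages);
}
```

With zero reclaimable pages the first variant leaves yield_count
untouched, while the second one still bumps it once per pass.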

--
Michal Hocko
SUSE Labs