Re: [PATCH] mm: memcg: Fix memcg reclaim soft lockup

From: Michal Hocko
Date: Wed Aug 26 2020 - 08:48:21 EST


On Wed 26-08-20 20:21:39, xunlei wrote:
> On 2020/8/26 8:07 PM, Michal Hocko wrote:
> > On Wed 26-08-20 20:00:47, xunlei wrote:
> >> On 2020/8/26 7:00 PM, Michal Hocko wrote:
> >>> On Wed 26-08-20 18:41:18, xunlei wrote:
> >>>> On 2020/8/26 4:11 PM, Michal Hocko wrote:
> >>>>> On Wed 26-08-20 15:27:02, Xunlei Pang wrote:
> >>>>>> We've met softlockup with "CONFIG_PREEMPT_NONE=y", when
> >>>>>> the target memcg doesn't have any reclaimable memory.
> >>>>>
> >>>>> Do you have any scenario when this happens or is this some sort of a
> >>>>> test case?
> >>>>
> >>>> It can happen on tiny guest scenarios.
> >>>
> >>> OK, you made me more curious. If this is a tiny guest and this is a hard
> >>> limit reclaim path then we should trigger an oom killer which should
> >>> kill the offender and that in turn bails out from the try_charge loop
> >>> (see should_force_charge). So how come this repeats enough in your setup
> >>> that it causes soft lockups?
> >>>
> >>
> >> should_force_charge() is false; the current task, trapped in the
> >> endless loop, is not the oom victim.
> >
> > How is that possible? If the oom killer kills a task and that doesn't
> > resolve the oom situation then it would go after another one until all
> > tasks are killed. Or is your task living outside of the memcg it tries
> > to charge?
> >
>
> All tasks are in memcgs. It looks like the first oom victim has not
> finished (it is unable to schedule), so later
> mem_cgroup_oom()->...->oom_evaluate_task() will set oc->chosen to -1
> and abort.

This shouldn't be possible for too long because the oom_reaper would
make the victim invisible to the oom killer, so the oom killer should
proceed. Also, mem_cgroup_out_of_memory takes a mutex, and that is
already an implicit scheduling point.

Which kernel version is this?

And just for clarification: I am not against the additional
cond_resched. That sounds like a good thing in general because we do
want predictable scheduling during reclaim that is as independent of
reclaimability as possible. But I would like to drill down to why you
are seeing the lockup, because those shouldn't really happen.
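
To illustrate what "scheduling independent of reclaimability" means,
here is a minimal userspace sketch. It is only a model, not the kernel
code and not the patch under discussion. All names (model_reclaim,
model_try_charge, MAX_RETRIES) are hypothetical, the 32-page batch size
is an arbitrary choice, and sched_yield() merely stands in for
cond_resched(). The point is that the loop yields on every pass, even
when a pass reclaims nothing.

```c
/* Userspace model only: NOT kernel code. All names are hypothetical. */
#include <sched.h>    /* sched_yield(), standing in for cond_resched() */
#include <stdbool.h>

#define MAX_RETRIES 16 /* hypothetical bound so the model terminates */

/* Hypothetical stand-in for one reclaim pass: returns pages freed. */
static unsigned long model_reclaim(unsigned long reclaimable)
{
    return reclaimable > 32 ? 32 : reclaimable;
}

/*
 * Model of a charge retry loop against a hard limit.
 * Returns true if nr_pages fit under the limit; *yields counts how
 * many scheduling points were hit along the way.
 */
static bool model_try_charge(unsigned long usage, unsigned long limit,
                             unsigned long reclaimable,
                             unsigned long nr_pages, unsigned int *yields)
{
    *yields = 0;
    for (int retry = 0; retry < MAX_RETRIES; retry++) {
        if (usage + nr_pages <= limit)
            return true;

        unsigned long freed = model_reclaim(reclaimable);
        if (freed > usage)      /* never free more than is charged */
            freed = usage;
        usage -= freed;
        reclaimable -= freed;

        /* Yield on every pass, whether or not anything was freed. */
        sched_yield();
        (*yields)++;
    }
    return usage + nr_pages <= limit;
}
```

With nothing reclaimable, model_try_charge(100, 100, 0, 1, &y) fails
after MAX_RETRIES passes but still yields once per pass, so other tasks
(an oom victim, for instance) get a chance to run.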

--
Michal Hocko
SUSE Labs