Re: [PATCH] mm, oom: allow oom reaper to race with exit_mmap

From: Michal Hocko
Date: Thu Jul 27 2017 - 02:32:13 EST


On Wed 26-07-17 18:39:28, Andrea Arcangeli wrote:
> On Wed, Jul 26, 2017 at 07:45:33AM +0200, Michal Hocko wrote:
> > Yes, exit_aio is the only blocking call I know of currently. But I would
> > like this to be as robust as possible and so I do not want to rely on
> > the current implementation. This can change in future and I can
> > guarantee that nobody will think about the oom path when adding
> > something to the final __mmput path.
>
> I think ksm_exit may block too waiting for allocations, the generic
> idea is those calls before exit_mmap can cause a problem yes.

I thought that ksm used __GFP_NORETRY but haven't checked too deeply.
Anyway I guess we agree that enabling oom_reaper to race with the final
__mmput is desirable?

[...]
> > This will work more or less the same to what we have currently.
> >
> > [victim] [oom reaper] [oom killer]
> > do_exit __oom_reap_task_mm
> > mmput
> > __mmput
> > mmget_not_zero
> > test_and_set_bit(MMF_OOM_SKIP)
> > oom_evaluate_task
> > # select next victim
> > # reap the mm
> > unmap_vmas
> >
> > so we can select a next victim while the current one is still not
> > completely torn down.
>
> How does oom_evaluate_task possibly run at the same time of
> test_and_set_bit in __oom_reap_task_mm considering both are running
> under the oom_lock?

You are absolutely right. This race is impossible. It was just me
assuming we are going to get rid of the oom_lock because I have that
idea in the back of my head and I would really like to get rid of
it. Global locks are nasty and I would prefer dropping it if we can.

[...]
--
Michal Hocko
SUSE Labs