Re: [PATCH] oom_reaper: close race without using oom_lock

From: Tetsuo Handa
Date: Fri Jul 21 2017 - 11:19:38 EST


Michal Hocko wrote:
> > If we ignore MMF_OOM_SKIP once, we can avoid sequence above.
>
> But we set MMF_OOM_SKIP _after_ the process lost its address space (well
> after the patch which allows to race oom reaper with the exit_mmap).
>
> >
> > Process-1                        Process-2
> >
> > Takes oom_lock.
> > Fails get_page_from_freelist().
> > Enters out_of_memory().
> > Get SIGKILL.
> > Get TIF_MEMDIE.
> > Leaves out_of_memory().
> > Releases oom_lock.
> > Enters do_exit().
> > Calls __mmput().
> >                                  Takes oom_lock.
> >                                  Fails get_page_from_freelist().
> > Releases some memory.
> > Sets MMF_OOM_SKIP.
> >                                  Enters out_of_memory().
> >                                  Ignores MMF_OOM_SKIP mm once.
> >                                  Leaves out_of_memory().
> >                                  Releases oom_lock.
> >                                  Succeeds get_page_from_freelist().
>
> OK, so let's say you have another task just about to jump into
> out_of_memory and ... end up in the same situation.

Right.

>
> This race is just
> unavoidable.

There is no perfect way (it is always timing dependent). But

>
> > Strictly speaking, this patch is independent with OOM reaper.
> > This patch increases possibility of succeeding get_page_from_freelist()
> > without sending SIGKILL. Your patch is trying to drop it silently.

we can try to reduce the possibility of ending up in the same situation with
this proposal, whereas your proposal does nothing to reduce that possibility,
because

> >
> > Serializing setting of MMF_OOM_SKIP with oom_lock is one approach,
> > and ignoring MMF_OOM_SKIP once without oom_lock is another approach.
>
> Or simply making sure that we only set the flag _after_ the address
> space is gone, which is what I am proposing.

the address space being gone does not guarantee that get_page_from_freelist()
will be called before entering out_of_memory() (e.g. the task may be preempted
for seconds between "Fails get_page_from_freelist()." and "Enters out_of_memory().").
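
For illustration only, the "Ignores MMF_OOM_SKIP mm once." step in the sequence
above can be modeled in user space roughly as below. This is a sketch under
assumptions, not the patch itself: struct mm_model, skip_ignored and
should_skip_mm() are hypothetical names that do not exist in the kernel, and a
real implementation would need an atomic test-and-set rather than plain
booleans.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model; none of these are kernel symbols. */
struct mm_model {
	bool oom_skip;      /* stands in for MMF_OOM_SKIP */
	bool skip_ignored;  /* true once the flag has been ignored one time */
};

/*
 * Returns true if victim selection should treat this mm as already
 * handled and move on to another victim.  The first caller that sees
 * oom_skip set gets "false" instead, so it can leave out_of_memory()
 * and retry the allocation before anybody else is killed.
 */
bool should_skip_mm(struct mm_model *mm)
{
	if (!mm->oom_skip)
		return false;            /* flag not set: nothing to skip */
	if (!mm->skip_ignored) {
		mm->skip_ignored = true; /* ignore the flag exactly once */
		return false;
	}
	return true;                     /* later callers honor the flag */
}
```

In this model, Process-2 in the sequence above would be the first caller and
get false, while the "another task just about to jump into out_of_memory" would
be a later caller and get true. That is why this only reduces the possibility
and does not close the race.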