Re: [PATCH 10/10] mm, oom: hide mm which is shared with kthread or global init

From: Tetsuo Handa
Date: Thu Jun 16 2016 - 09:15:24 EST


Michal Hocko wrote:
> On Fri 10-06-16 00:15:18, Tetsuo Handa wrote:
> [...]
> > Nobody will set MMF_OOM_REAPED flag if can_oom_reap == true on
> > CONFIG_MMU=n kernel. If a TIF_MEMDIE thread in CONFIG_MMU=n kernel
> > is blocked before exit_oom_victim() in exit_mm() from do_exit() is
> > called, the system will lock up. This is not handled in the patch
> > nor explained in the changelog.
>
> I have made it clear several times that !CONFIG_MMU is not a target
> of this patch series nor other OOM changes because I am not convinced
> issues which we are trying to solve are real on those platforms. I
> am not really sure what you are trying to achieve now with these
> !CONFIG_MMU remarks but if you see _real_ regressions for those
> configurations please describe them. This generic statements when
> CONFIG_MMU implications are put into !CONFIG_MMU context are not really
> useful. If there are possible OOM killer deadlocks without this series
> then adding these patches shouldn't make them worse.
>
> E.g. this particular patch is basically a noop for !CONFIG_MMU because
> use_mm() is MMU specific. It is also highly improbable that a task would
> share mm with init...

But this is not safe for CONFIG_MMU=y kernels either.
can_oom_reap == false means that oom_reap_task() will not be called.
It is possible for the TIF_MEMDIE thread to end up in the

atomic_read(&task->signal->oom_victims) > 0 && find_lock_task_mm(task) == NULL

situation, so we are still risking an OOM livelock. We must somehow clear (or
ignore) TIF_MEMDIE even if oom_reap_task() is not called.
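
To make the condition above concrete, here is a minimal userspace sketch of
the state in question. It is not kernel code: struct model_task, struct
model_mm and both helper functions are hypothetical names invented for
illustration, and the fields only stand in for
atomic_read(&task->signal->oom_victims), the result of
find_lock_task_mm(task), and the can_oom_reap decision. Whether this state
really turns into a livelock depends on kernel details that are not modeled
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's definitions. */
struct model_mm {
    int unused;
};

struct model_task {
    int oom_victims;       /* models atomic_read(&task->signal->oom_victims) */
    struct model_mm *mm;   /* NULL models find_lock_task_mm(task) == NULL */
    bool can_oom_reap;     /* models whether oom_reap_task() would be called */
};

/* True when the task is still counted as an OOM victim but has no mm left. */
bool victim_without_mm(const struct model_task *t)
{
    assert(t != NULL);
    return t->oom_victims > 0 && t->mm == NULL;
}

/*
 * True when the state above coincides with the reaper being skipped,
 * i.e. nothing in this model is left to clear (or ignore) the victim state.
 */
bool risks_oom_livelock(const struct model_task *t)
{
    return victim_without_mm(t) && !t->can_oom_reap;
}
```

In this model, only the combination "still counted as a victim, no mm, and
reaper skipped" is flagged; a victim that still has an mm, or one the reaper
would handle, is not.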

Can't we apply http://lkml.kernel.org/r/201606102323.BCC73478.FtOJHFQMSVFLOO@xxxxxxxxxxxxxxxxxxx now?