Re: [PATCH 08/10] mm, oom: task_will_free_mem should skip oom_reaped tasks

From: Tetsuo Handa
Date: Fri Jun 17 2016 - 07:35:48 EST


Michal Hocko wrote:
> From: Michal Hocko <mhocko@xxxxxxxx>
>
> 0-day robot has encountered the following:
> [ 82.694232] Out of memory: Kill process 3914 (trinity-c0) score 167 or sacrifice child
> [ 82.695110] Killed process 3914 (trinity-c0) total-vm:55864kB, anon-rss:1512kB, file-rss:1088kB, shmem-rss:25616kB
> [ 82.706724] oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:26488kB
> [ 82.715540] oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:26900kB
> [ 82.717662] oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:26900kB
> [ 82.725804] oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:27296kB
> [ 82.739091] oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:28148kB
>
> oom_reaper is trying to reap the same task again and again. This
> is possible only when the oom killer is bypassed because of
> task_will_free_mem because we skip over tasks with MMF_OOM_REAPED
> already set during select_bad_process. Teach task_will_free_mem to skip
> over MMF_OOM_REAPED tasks as well because they will be unlikely to free
> anything more.

I agree that we need to prevent the same mm from being selected forever. But I
am worried about this patch. We are reaching a stage where we have to ask what
purpose we set TIF_MEMDIE for. mark_oom_victim() sets TIF_MEMDIE on a thread
with oom_lock held. Thus, if the mm which the TIF_MEMDIE thread is using is
reapable (likely yes), the OOM reaper in __oom_reap_task() will likely be the
next to get that lock, because __oom_reap_task() uses mutex_lock(&oom_lock)
whereas other threads using that mm use mutex_trylock(&oom_lock). As a result,
for CONFIG_MMU=y kernels, I guess that the

if (task_will_free_mem(current)) {

shortcut in out_of_memory() likely becomes a useless condition. Since the OOM
reaper will quickly reap the mm, set MMF_OOM_REAPED on that mm and clear
TIF_MEMDIE, other threads using that mm will fail to get TIF_MEMDIE (because
task_will_free_mem() will start returning false due to this patch) and will
proceed to the next OOM victim selection. The comment

* That thread will now get access to memory reserves since it has a
* pending fatal signal.

in oom_kill_process() becomes almost dead. Since we need a short delay in order
to allow get_page_from_freelist() to allocate from memory reclaimed by
__oom_reap_task(), this patch might increase the possibility of excessively
preventing OOM-killed threads from using ALLOC_NO_WATERMARKS via TIF_MEMDIE,
and of needlessly selecting the next OOM victim.
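
To make the sequence above concrete, here is a toy userspace model, not kernel
code. Every toy_* name is made up for illustration, and the real
task_will_free_mem() checks considerably more than a pending fatal signal. It
only shows that once the reaper has set the flag, a second thread sharing the
mm no longer gets the shortcut with the patch applied.

```c
#include <stdbool.h>

/* Toy stand-ins for the kernel state named in this mail (all hypothetical). */
struct toy_mm {
	bool oom_reaped;	/* models MMF_OOM_REAPED */
};

struct toy_task {
	struct toy_mm *mm;
	bool fatal_signal;	/* models a pending fatal signal */
	bool memdie;		/* models TIF_MEMDIE */
};

/* Simplified shortcut check without the patch: a dying task qualifies. */
static bool toy_will_free_mem_old(const struct toy_task *t)
{
	return t->fatal_signal;
}

/* Simplified shortcut check with the patch: a reaped mm never qualifies. */
static bool toy_will_free_mem_patched(const struct toy_task *t)
{
	if (t->mm->oom_reaped)
		return false;
	return t->fatal_signal;
}

/* Models the reaper: mark the mm as reaped and clear the victim's flag. */
static void toy_reap(struct toy_mm *mm, struct toy_task *victim)
{
	mm->oom_reaped = true;
	victim->memdie = false;
}

/*
 * Models the shortcut in out_of_memory(). Returns true if the caller was
 * granted memdie, false if it would fall through to victim selection.
 */
static bool toy_out_of_memory(struct toy_task *cur, bool patched)
{
	bool will_free = patched ? toy_will_free_mem_patched(cur)
				 : toy_will_free_mem_old(cur);

	if (will_free) {
		cur->memdie = true;
		return true;
	}
	return false;
}
```

With two tasks sharing one toy_mm, calling toy_reap() on the first and then
toy_out_of_memory() on the second takes the shortcut in the unpatched variant
and falls through to victim selection in the patched one.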

So, maybe we shouldn't let this shortcut return false as soon as
MMF_OOM_REAPED is set.
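
One possible shape of that idea, as a rough userspace sketch only: keep the
shortcut alive for a bounded number of checks after the mm was reaped, then
give up so the same mm cannot be picked forever. The counter, the
TOY_REAP_GRACE value and all toy_* names are assumptions for illustration and
do not exist in the kernel.

```c
#include <stdbool.h>

#define TOY_REAP_GRACE 3	/* hypothetical number of tolerated checks */

struct toy_reaped_mm {
	bool oom_reaped;	/* models MMF_OOM_REAPED */
	int checks_since_reap;	/* hypothetical bookkeeping, not a kernel field */
};

/*
 * Returns true while the shortcut may still be taken for this mm.
 * An unreaped mm always qualifies; a reaped mm qualifies only for the
 * first TOY_REAP_GRACE checks, which bounds how long it can be reused.
 */
static bool toy_shortcut_allowed(struct toy_reaped_mm *mm)
{
	if (!mm->oom_reaped)
		return true;
	if (mm->checks_since_reap < TOY_REAP_GRACE) {
		mm->checks_since_reap++;
		return true;
	}
	return false;
}
```

Whether such a grace period should be counted in checks, in time, or in
something else entirely is an open question; the sketch only shows that a
delay and a bound on reselection are not mutually exclusive.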