Re: [PATCH] mm/oom_kill.c: don't kill TASK_UNINTERRUPTIBLE tasks

From: Oleg Nesterov
Date: Fri Sep 18 2015 - 12:29:47 EST


On 09/18, Christoph Lameter wrote:
>
> > But yes, such a deadlock is possible. I would really like to see the comments
> > from maintainers. In particular, I seem to recall that someone suggested to
> > try to kill another !TIF_MEMDIE process after timeout, perhaps this is what
> > we should actually do...
>
> Well yes here is a patch that kills another memdie process but there is
> some risk with such an approach of overusing the reserves.

Yes, I understand it is not that simple. And probably this is all I can
understand ;)

> --- linux.orig/mm/oom_kill.c 2015-09-18 10:38:29.601963726 -0500
> +++ linux/mm/oom_kill.c 2015-09-18 10:39:55.911699017 -0500
> @@ -265,8 +265,8 @@ enum oom_scan_t oom_scan_process_thread(
>  	 * Don't allow any other task to have access to the reserves.
>  	 */
>  	if (test_tsk_thread_flag(task, TIF_MEMDIE)) {
> -		if (oc->order != -1)
> -			return OOM_SCAN_ABORT;
> +		if (unlikely(frozen(task)))
> +			__thaw_task(task);

To simplify the discussion, let's ignore PF_FROZEN; that is a separate issue.

I am not sure this change is enough; we also need to ensure that
select_bad_process() won't pick the same task (or its sub-thread) again.
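
To make the concern concrete, here is a rough userspace sketch of a
selection loop that refuses to pick a task whose thread group already
contains a TIF_MEMDIE thread. Everything in it is hypothetical: struct
model_task, group_has_victim() and model_select_victim() are illustrative
stand-ins, not the kernel's task_struct or the real select_bad_process().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a task; not the kernel's task_struct. */
struct model_task {
	int pid;
	int tgid;		/* threads of one process share the same tgid */
	bool memdie;		/* models TIF_MEMDIE being set on this thread */
	unsigned long points;	/* models the badness score */
};

/* True if any thread in the given thread group is already an OOM victim. */
bool group_has_victim(const struct model_task *tasks, size_t n, int tgid)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].tgid == tgid && tasks[i].memdie)
			return true;
	return false;
}

/*
 * Pick the highest-scoring task whose thread group has no victim yet.
 * Returns NULL if every candidate belongs to an already-chosen group.
 */
const struct model_task *model_select_victim(const struct model_task *tasks,
					     size_t n)
{
	const struct model_task *chosen = NULL;

	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* Skip the previous victim and all of its sub-threads. */
		if (group_has_victim(tasks, n, tasks[i].tgid))
			continue;
		if (!chosen || tasks[i].points > chosen->points)
			chosen = &tasks[i];
	}
	return chosen;
}
```

With a victim thread and a sibling thread sharing one tgid, both are
skipped even though the sibling itself does not carry the flag, and the
next-highest scorer from a different process is returned instead.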

And perhaps something like

	wait_event_timeout(oom_victims_wait, !oom_victims,
				configurable_timeout);

before select_bad_process() makes sense?
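
The intended control flow can be modelled in userspace with a simple tick
loop. This is only a sketch under stated assumptions: model_wait_victims(),
the tick_fn hook and victim_exits_at_tick_3() are hypothetical names, ticks
stand in for jiffies, and nothing here sleeps or uses a real wait queue.
The return convention (0 on timeout, otherwise the remaining time, at
least 1) is chosen to mirror wait_event_timeout().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-tick hook that lets the simulation model victims exiting. */
typedef void (*tick_fn)(long tick, int *oom_victims);

/*
 * Userspace model of "wait until !oom_victims, or give up after timeout".
 * Returns the ticks left (at least 1) if the victims drained in time,
 * or 0 if the timeout elapsed with victims still pending.
 */
long model_wait_victims(int *oom_victims, long timeout, tick_fn on_tick)
{
	assert(oom_victims != NULL && timeout >= 0);

	for (long tick = 0; tick < timeout; tick++) {
		if (*oom_victims == 0)
			return timeout - tick;
		if (on_tick)
			on_tick(tick, oom_victims);
	}
	/* Final check after the last tick, as the condition may just have changed. */
	return *oom_victims == 0 ? 1 : 0;
}

/* Example hook: one pending victim finishes exiting during tick 3. */
void victim_exits_at_tick_3(long tick, int *oom_victims)
{
	if (tick == 3 && *oom_victims > 0)
		(*oom_victims)--;
}
```

A caller would go on to select another victim only when this returns 0,
that is, when the earlier victim failed to exit within the timeout. If the
victims drain in time, no second task gets access to the reserves.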

Oleg.
