[PATCH] mm,oom: Always sleep before retrying.

From: Tetsuo Handa
Date: Tue Dec 29 2015 - 21:01:48 EST


From c0b5820c594343e06239f15afb35d23b4b8ac0d0 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Date: Wed, 30 Dec 2015 10:55:59 +0900
Subject: [PATCH] mm,oom: Always sleep before retrying.

Once we have entered the "Reclaim has failed us, start killing things"
state, a sleep function is called only when mutex_trylock(&oom_lock)
in __alloc_pages_may_oom() fails, or immediately after
oom_kill_process() returns in out_of_memory(). This may not be enough
to give other tasks a chance to run, because mutex_trylock(&oom_lock)
will not fail on a non-preemptive UP kernel.

If the request is a !__GFP_FS && !__GFP_NOFAIL allocation,
__alloc_pages_may_oom() returns without sleeping, and
__alloc_pages_slowpath() retries without sleeping.
As a result, other tasks never get a chance to run.
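
The branch selection described above can be sketched as a small
user-space model. This is only an illustration under stated assumptions,
not the kernel code: the TOY_* flag values, the enum and the function
name are hypothetical, and trylock_ok stands in for the result of
mutex_trylock(&oom_lock).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's gfp bits; values are arbitrary. */
#define TOY_GFP_FS     0x1u
#define TOY_GFP_NOFAIL 0x2u

/* Which way a single pass through the pre-patch logic goes. */
enum toy_path {
	TOY_SLEEP_AND_RETRY,    /* trylock failed: sleep one tick, then retry */
	TOY_RETRY_NO_SLEEP,     /* progress is reported, but nobody sleeps */
	TOY_CALL_OOM_KILLER     /* out_of_memory() would be called */
};

/*
 * Toy model of the pre-patch decision in __alloc_pages_may_oom(), as
 * described in the commit message. It models only the branch that is
 * taken, not the allocation itself.
 */
static enum toy_path toy_may_oom_prepatch(unsigned int gfp, bool trylock_ok)
{
	if (!trylock_ok)
		return TOY_SLEEP_AND_RETRY;

	/* !__GFP_FS && !__GFP_NOFAIL: return and let the caller retry. */
	if (!(gfp & TOY_GFP_FS) && !(gfp & TOY_GFP_NOFAIL))
		return TOY_RETRY_NO_SLEEP;

	/* __GFP_FS || __GFP_NOFAIL */
	return TOY_CALL_OOM_KILLER;
}
```

With trylock_ok always true, which is the non-preemptive UP case, a
request with neither flag set takes TOY_RETRY_NO_SLEEP on every pass.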

If the request is a __GFP_FS || __GFP_NOFAIL allocation, out_of_memory()
is called. But if the OOM victim fails to terminate before
schedule_timeout_killable(1) returns, the victim never gets another
chance to run, because the task that called out_of_memory() does not
sleep again.

We should not rely on mutex_trylock(&oom_lock) for a sleep. This patch
makes sure everybody sleeps before __alloc_pages_slowpath() retries.
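
The effect of sleeping on every retry can be illustrated with a toy
user-space model. It is a sketch under simplifying assumptions, not
kernel code: the function name and parameters are hypothetical, the
victim is assumed to run for exactly one tick each time the allocating
task sleeps (as on a non-preemptive UP kernel), and the TIF_MEMDIE
check is not modeled.

```c
#include <stdbool.h>

/*
 * Toy model: an allocating task retries until the OOM victim exits.
 * The victim needs victim_ticks ticks of CPU time to terminate and only
 * runs while the allocating task sleeps.
 *
 * sleep_every_retry == false models a single sleep after the kill;
 * sleep_every_retry == true models one sleep before each retry.
 *
 * Returns the number of retries performed before the victim has exited,
 * or -1 if max_retries is reached first.
 */
static int toy_retry_until_victim_exits(int victim_ticks,
					bool sleep_every_retry,
					int max_retries)
{
	bool slept_once = false;

	for (int retries = 0; retries < max_retries; retries++) {
		if (victim_ticks <= 0)
			return retries;	/* victim exited; memory was freed */

		if (sleep_every_retry || !slept_once) {
			/* Models a one-tick sleep: the victim gets one tick. */
			victim_ticks--;
			slept_once = true;
		}
	}
	return -1;	/* allocator kept looping; victim never finished */
}
```

In this model a victim that needs three ticks exits after three retries
when the allocator sleeps on every retry, but never exits when the
allocator sleeps only once.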

Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
---
mm/page_alloc.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2565154..6f7f786 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2734,7 +2734,6 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 	 */
 	if (!mutex_trylock(&oom_lock)) {
 		*did_some_progress = 1;
-		schedule_timeout_uninterruptible(1);
 		return NULL;
 	}

@@ -3282,6 +3281,12 @@ retry:
 	/* Retry as long as the OOM killer is making progress */
 	if (did_some_progress) {
 		no_progress_loops = 0;
+		/*
+		 * Make sure that other tasks (e.g. OOM victims, workqueue
+		 * items) are given a chance to run.
+		 */
+		if (!test_thread_flag(TIF_MEMDIE))
+			schedule_timeout_uninterruptible(1);
 		goto retry;
 	}

--
1.8.3.1