Re: can't oom-kill zap the victim's memory?

From: Vlastimil Babka
Date: Thu Oct 08 2015 - 05:40:20 EST


On 10/07/2015 12:43 PM, Tetsuo Handa wrote:
> Vlastimil Babka wrote:
>> On 5.10.2015 16:44, Michal Hocko wrote:
>>> So I can see basically only few ways out of this deadlock situation.
>>> Either we face the reality and allow small allocations (without
>>> __GFP_NOFAIL) to fail after all attempts to reclaim memory have failed
>>> (so after even OOM killer hasn't made any progress).
>>
>> Note that small allocations already *can* fail if they are done in the context
>> of a task selected as OOM victim (i.e. TIF_MEMDIE). And yeah I've seen a case
>> when they failed in a code that "handled" the allocation failure with a
>> BUG_ON(!page).
>
> Did you hit a race described below?

I don't know, I don't even have direct evidence of TIF_MEMDIE being set, but
OOMs were happening all over the place, and I haven't found another reason why
the allocation would not be too-small-to-fail otherwise.

> http://lkml.kernel.org/r/201508272249.HDH81838.FtQOLMFFOVSJOH@xxxxxxxxxxxxxxxxxxx
>
> Where was the BUG_ON(!page)? Maybe it is a candidate for adding __GFP_NOFAIL.

Yes, I suggested so:
http://marc.info/?l=linux-kernel&m=144181523115244&w=2
