Re: [patch 3/3 -mmotm] oom: invoke oom killer for __GFP_NOFAIL

From: Peter Zijlstra
Date: Tue Jun 02 2009 - 03:35:18 EST


On Tue, 2009-06-02 at 00:26 -0700, David Rientjes wrote:
> > I really think/hope/expect that this is unneeded.
> >
> > Do we know of any callsites which do greater-than-order-0 allocations
> > with GFP_NOFAIL? If so, we should fix them.
> >
> > Then just ban order>0 && GFP_NOFAIL allocations.
> >
>
> That seems like a different topic: banning higher-order __GFP_NOFAIL
> allocations or just deprecating __GFP_NOFAIL altogether and slowly
> switching users over is a worthwhile effort, but is unrelated.
>
> This patch is necessary because we explicitly deny the oom killer from
> being used when the order is greater than PAGE_ALLOC_COSTLY_ORDER because
> of an assumption that it won't help. That assumption isn't always true,
> especially for large memory-hogging tasks that have mlocked large chunks
> of contiguous memory, for example. The only thing we do know is that
> direct reclaim has not made any progress so we're unlikely to get a
> substantial amount of memory freeing in the immediate future. Such an
> instance will simply loop forever without killing that rogue task for a
> __GFP_NOFAIL allocation.
>
> So while it's better in the long-term to deprecate the flag as much as
> possible and perhaps someday remove it from the page allocator entirely,
> we're faced with the current behavior of either looping endlessly or
> freeing memory so the kernel allocation may succeed when direct reclaim
> has failed, which also makes this a rare instance where the oom killer
> will never needlessly kill a task.
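
For readers following the thread, the behavior David describes can be sketched as a toy model in C. This is not the page allocator's code: `toy_after_failed_reclaim`, `TOY_GFP_NOFAIL`, `TOY_MAX_ORDER`, the `patched` switch and the enum are hypothetical names made up for illustration. The model ignores every other GFP flag and only captures the one decision under discussion, namely what happens once direct reclaim has made no progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values; the real definitions live in the kernel headers. */
#define PAGE_ALLOC_COSTLY_ORDER 3
#define TOY_MAX_ORDER 10          /* hypothetical upper bound for this model */
#define TOY_GFP_NOFAIL 0x1u       /* hypothetical stand-in for __GFP_NOFAIL */

enum toy_action {
    TOY_FAIL,                 /* give up and return NULL to the caller */
    TOY_RETRY_ONLY,           /* loop again without freeing anything */
    TOY_OOM_KILL_THEN_RETRY   /* kill a task, then retry the allocation */
};

/*
 * Decide what to do after direct reclaim made no progress.
 * patched == false: the oom killer is skipped for costly orders, so a
 *                   NOFAIL caller can only retry (possibly forever).
 * patched == true:  NOFAIL callers may invoke the oom killer at any order.
 */
enum toy_action toy_after_failed_reclaim(unsigned int order,
                                         unsigned int flags,
                                         bool patched)
{
    bool nofail = (flags & TOY_GFP_NOFAIL) != 0;
    bool costly = order > PAGE_ALLOC_COSTLY_ORDER;

    assert(order <= TOY_MAX_ORDER);

    if (!costly || (patched && nofail))
        return TOY_OOM_KILL_THEN_RETRY;
    return nofail ? TOY_RETRY_ONLY : TOY_FAIL;
}
```

In this model an order-4 request with the NOFAIL bit set yields `TOY_RETRY_ONLY` when `patched` is false and `TOY_OOM_KILL_THEN_RETRY` when it is true, which is the difference the patch is arguing for.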

I would really prefer it if we did as Andrew suggests. Both approaches
will fix this problem, so I don't see it as a different topic at all.

Eradicating __GFP_NOFAIL is a fine goal, but very hard work (people have
been wanting to do that for many years). Simply limiting it to order-0
allocations should be much(?) easier.
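
A minimal sketch of what such a restriction could look like, written as standalone C rather than as a kernel patch. The names `toy_nofail_request_allowed`, `toy_self_check` and `TOY_GFP_NOFAIL` are hypothetical and exist only for this illustration; where the real check would live, and whether it would warn or refuse outright, is not settled in this thread.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_GFP_NOFAIL 0x1u   /* hypothetical stand-in for __GFP_NOFAIL */

/*
 * Returns true if the request is acceptable under an
 * "NOFAIL is only valid for order-0 allocations" rule.
 */
bool toy_nofail_request_allowed(unsigned int order, unsigned int flags)
{
    if ((flags & TOY_GFP_NOFAIL) && order > 0)
        return false;   /* higher-order NOFAIL request: banned */
    return true;        /* order-0 NOFAIL, or any request without NOFAIL */
}

/* Small usage example exercising the rule. */
void toy_self_check(void)
{
    assert(toy_nofail_request_allowed(0, TOY_GFP_NOFAIL));
    assert(!toy_nofail_request_allowed(1, TOY_GFP_NOFAIL));
    assert(toy_nofail_request_allowed(4, 0));
}
```

With a rule like this in place, a NOFAIL request never reaches the costly-order case at all, which is why limiting the flag to order-0 would also address the endless loop.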

