Re: upcoming kerneloops.org item: get_page_from_freelist

From: David Rientjes
Date: Tue Jun 30 2009 - 15:48:02 EST


On Tue, 30 Jun 2009, Nick Piggin wrote:

> > Yeah, so if test_thread_flag(TIF_MEMDIE) and __GFP_NOMEMALLOC, then it
> > makes sense to return NULL immediately following the call to the oom
> > killer for !__GFP_NOFAIL since retrying the allocation is pointless
> > (reclaim failed already and TIF_MEMDIE doesn't help us on the next
> > attempt) at that time.
>
> I don't see the importance of calling the oom killer. If a thread
> is TIF_MEMDIE, then we should not try to enter reclaim nor try to
> call the oom killer. The oom killer has already been activated and
> because it has been determined that nothing can be reclaimed...
>

Right, there's no need to call it a second time. I was referring to the
initial call, the one that did set_tsk_thread_flag(current, TIF_MEMDIE).
When we return to the page allocator from the oom killer, there's no sense
in retrying for __GFP_NOMEMALLOC and !__GFP_NOFAIL since the allocation
can't use memory reserves anyway.
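
To make that rule concrete, here's a rough user-space sketch of the
decision, not a patch. The SK_* flag values and sk_retry_after_oom() are
made up for illustration and don't correspond to anything in mm/. The
"retry as usual" branches are my assumption about the cases not discussed
above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real __GFP_* flags have different values. */
#define SK_GFP_NOFAIL		0x1u
#define SK_GFP_NOMEMALLOC	0x2u

/*
 * Decide whether the allocator should retry after returning from the
 * oom killer. 'memdie' models test_thread_flag(TIF_MEMDIE) on current.
 */
static bool sk_retry_after_oom(unsigned int gfp_mask, bool memdie)
{
	if (!memdie)
		return true;	/* assumption: somebody else was killed, retry */
	if (gfp_mask & SK_GFP_NOFAIL)
		return true;	/* __GFP_NOFAIL may not return NULL */
	if (gfp_mask & SK_GFP_NOMEMALLOC)
		return false;	/* no access to reserves: retrying is pointless */
	return true;		/* TIF_MEMDIE can dip into reserves next time */
}
```

With this, sk_retry_after_oom(SK_GFP_NOMEMALLOC, true) is the only case
that gives up and returns NULL to the caller.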

That doesn't mean the oom killer shouldn't kill current when
__GFP_NOMEMALLOC is set, though, because current can still use memory
reserves along the exit path.

> > Calling the oom killer won't do anything since it will not kill another
> > task while another has TIF_MEMDIE to protect those memory reserves and
> > give the oom killed task a chance to exit.
>
> I don't mean the normal oom-killer path, but another call to say
> "this thread got stuck, un-kill me and look for someone else to kill"
> or somesuch.
>

Right, and my suggestion for doing that was an oom killer timeout as the
threshold for determining when a thread is "stuck," because usually that
means it's blocked in TASK_UNINTERRUPTIBLE, not because memory reserves
are empty. I'd be interested in alternative approaches other than a
timeout that determine when another task should be killed.
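
A timeout check along those lines could look roughly like the sketch
below. This is a user-space model only: SK_OOM_TIMEOUT_TICKS,
sk_oom_victim_stuck() and the idea of recording a "killed_at" tick count
are all hypothetical, and the timeout value is arbitrary.

```c
#include <stdbool.h>

/* Hypothetical threshold; no specific value has been proposed. */
#define SK_OOM_TIMEOUT_TICKS	1000UL

/*
 * Treat an oom-killed task as "stuck" once it has held TIF_MEMDIE for
 * at least the timeout. 'now' and 'killed_at' are free-running tick
 * counters, similar in spirit to jiffies.
 */
static bool sk_oom_victim_stuck(unsigned long now, unsigned long killed_at)
{
	/* Unsigned subtraction stays correct across counter wraparound. */
	return (now - killed_at) >= SK_OOM_TIMEOUT_TICKS;
}
```

Only when this returns true would the oom killer be allowed to pick
another victim.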

It's always possible that a "stuck" task has fully depleted memory
reserves and no forward progress will be made by anybody, so this is a
very bad situation to begin with.

> > Panicking when a thread with TIF_MEMDIE set cannot find any memory and the
> > allocation is __GFP_NOFAIL makes sense, but only for order 0.
>
> Why only order-0? What would you do at order>0?
>

For order > 0, it'd need to loop forever like it currently does for
__GFP_NOFAIL in __alloc_pages_high_priority(). It's possible that the
allocation will eventually succeed once another task frees memory after
its own allocation succeeds. In that case the failure isn't the result of
memory being totally unavailable, but of memory being fragmented enough
to prevent the TIF_MEMDIE task from succeeding for order > 0.

We'd also need to consider whether the allocation is constrained to
lowmem, in which case the panic would be premature.
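
Put together, the policy for a TIF_MEMDIE task whose __GFP_NOFAIL
allocation just failed might be sketched like this. Again, this is a
user-space illustration with invented names (sk_action,
sk_memdie_nofail_action(), lowmem_only). Treating the lowmem-constrained
case as "keep retrying" is my assumption about what avoiding a premature
panic would mean.

```c
#include <stdbool.h>

enum sk_action { SK_RETRY, SK_PANIC };

/*
 * What to do when a task with TIF_MEMDIE set cannot find memory for a
 * __GFP_NOFAIL allocation of the given order.
 */
static enum sk_action sk_memdie_nofail_action(unsigned int order,
					      bool lowmem_only)
{
	if (order > 0)
		return SK_RETRY;	/* may only be fragmentation: keep looping */
	if (lowmem_only)
		return SK_RETRY;	/* other zones may have memory: panic is premature */
	return SK_PANIC;		/* order 0, unconstrained, reserves empty */
}
```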