Re: [PATCH 3/3] mm: page allocator: Drain per-cpu lists after direct reclaim allocation fails

From: Andrew Morton
Date: Fri Sep 03 2010 - 19:01:07 EST


On Fri, 3 Sep 2010 10:08:46 +0100
Mel Gorman <mel@xxxxxxxxx> wrote:

> When under significant memory pressure, a process enters direct reclaim
> and immediately afterwards tries to allocate a page. If it fails and no
> further progress is made, it's possible the system will go OOM. However,
> on systems with large amounts of memory, it's possible that a significant
> number of pages are on per-cpu lists and inaccessible to the calling
> process. This leads to a process entering direct reclaim more often than
> it should, increasing the pressure on the system and compounding the problem.
>
> This patch notes that if direct reclaim is making progress but
> allocations are still failing that the system is already under heavy
> pressure. In this case, it drains the per-cpu lists and tries the
> allocation a second time before continuing.
>
> Signed-off-by: Mel Gorman <mel@xxxxxxxxx>
> Reviewed-by: Minchan Kim <minchan.kim@xxxxxxxxx>
> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
> Reviewed-by: Christoph Lameter <cl@xxxxxxxxx>
> ---
> mm/page_alloc.c | 20 ++++++++++++++++----
> 1 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index bbaa959..750e1dc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1847,6 +1847,7 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
>  	struct page *page = NULL;
>  	struct reclaim_state reclaim_state;
>  	struct task_struct *p = current;
> +	bool drained = false;
>
>  	cond_resched();
>
> @@ -1865,14 +1866,25 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
>
>  	cond_resched();
>
> -	if (order != 0)
> -		drain_all_pages();
> +	if (unlikely(!(*did_some_progress)))
> +		return NULL;
>
> -	if (likely(*did_some_progress))
> -		page = get_page_from_freelist(gfp_mask, nodemask, order,
> +retry:
> +	page = get_page_from_freelist(gfp_mask, nodemask, order,
>  					zonelist, high_zoneidx,
>  					alloc_flags, preferred_zone,
>  					migratetype);
> +
> +	/*
> +	 * If an allocation failed after direct reclaim, it could be because
> +	 * pages are pinned on the per-cpu lists. Drain them and try again
> +	 */
> +	if (!page && !drained) {
> +		drain_all_pages();
> +		drained = true;
> +		goto retry;
> +	}
> +
>  	return page;
>  }

The patch looks reasonable.

But please take a look at the recent thread "mm: minute-long livelocks
in memory reclaim". There, people are pointing fingers at that
drain_all_pages() call, suspecting that it's causing huge IPI storms.

Dave was going to test this theory but afaik hasn't yet done so. It
would be nice to tie these threads together if possible.
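
For anyone following along without the tree handy, the control flow the
patch introduces can be sketched in user-space C roughly as below. This is
only an illustration, not kernel code: struct fake_zone, fake_get_page(),
fake_drain_all() and alloc_after_reclaim() are made-up stand-ins for the
free lists, get_page_from_freelist(), drain_all_pages() and
__alloc_pages_direct_reclaim(), and the drain_calls counter exists purely to
show how often the drain would fire.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of allocator state; none of this is kernel API. */
struct fake_zone {
	int free_pages;		/* pages on the shared free list */
	int pcp_pages;		/* pages parked on per-cpu lists */
	int drain_calls;	/* number of drains performed so far */
};

static int page_token;		/* dummy object standing in for a page */

/* Stand-in for get_page_from_freelist(): fails if the free list is empty. */
static void *fake_get_page(struct fake_zone *z)
{
	if (z->free_pages == 0)
		return NULL;
	z->free_pages--;
	return &page_token;
}

/* Stand-in for drain_all_pages(): move per-cpu pages to the free list. */
static void fake_drain_all(struct fake_zone *z)
{
	z->free_pages += z->pcp_pages;
	z->pcp_pages = 0;
	z->drain_calls++;
}

/* Same shape as the patched function: bail out early if reclaim made no
 * progress, otherwise try once, drain at most once, then try again. */
static void *alloc_after_reclaim(struct fake_zone *z, bool did_some_progress)
{
	void *page;
	bool drained = false;

	if (!did_some_progress)
		return NULL;

retry:
	page = fake_get_page(z);
	if (!page && !drained) {
		fake_drain_all(z);
		drained = true;
		goto retry;
	}

	return page;
}
```

In this model each call drains at most once, and only when reclaim made
progress and the first attempt failed. Whether that is cheap enough in
practice depends on how often that path is hit, which is the question the
other thread is asking.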
