[RFC 2/3] Use NOMEMALLOC reclaim to allow reclaim if PF_MEMALLOC is set

From: Christoph Lameter
Date: Tue Aug 14 2007 - 10:33:22 EST


If we exhaust the reserves in the page allocator while PF_MEMALLOC is set,
do not give up. Instead, call into reclaim with PF_MEMALLOC still set.

This is in essence a recursive call back into page reclaim with another
gfp flag (__GFP_NOMEMALLOC) set. The recursion is bounded since potential
allocations with __GFP_NOMEMALLOC set will not enter that branch again.
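
The bound on the recursion can be illustrated with a small userspace toy
model. This is only a sketch, not kernel code: the TOY_* flags, the
counters, toy_alloc() and toy_reclaim() are hypothetical stand-ins, and the
model assumes that any allocation made from inside reclaim carries the gfp
mask that reclaim was called with.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flags; the values are arbitrary. */
#define TOY_PF_MEMALLOC    0x1u	/* task is already inside reclaim */
#define TOY_GFP_WAIT       0x2u	/* caller is allowed to sleep */
#define TOY_GFP_NOMEMALLOC 0x4u	/* allocation must not use the reserves */

static int free_pages;		/* toy free list; reserves count as empty */
static int clean_pages;		/* clean file-backed pages reclaim can free */
static int reclaim_depth;	/* current nesting of toy_reclaim() */
static int max_reclaim_depth;	/* deepest nesting observed so far */

static bool toy_alloc(unsigned int task_flags, unsigned int gfp);

/* Toy reclaim: may allocate itself, then frees one clean page if it can. */
static bool toy_reclaim(unsigned int task_flags, unsigned int gfp)
{
	bool progress;

	reclaim_depth++;
	if (reclaim_depth > max_reclaim_depth)
		max_reclaim_depth = reclaim_depth;
	assert(reclaim_depth <= 1);	/* the bound this model demonstrates */

	/* Assumption: reclaim allocates with the mask it was handed. */
	(void)toy_alloc(task_flags, gfp);

	progress = clean_pages > 0;
	if (progress) {
		clean_pages--;
		free_pages++;
	}
	reclaim_depth--;
	return progress;
}

/* Toy allocator slow path, reduced to the branch the patch touches. */
static bool toy_alloc(unsigned int task_flags, unsigned int gfp)
{
restart:
	if (free_pages > 0) {
		free_pages--;
		return true;
	}
	/* The PF_MEMALLOC branch is skipped when NOMEMALLOC is set. */
	if ((task_flags & TOY_PF_MEMALLOC) && !(gfp & TOY_GFP_NOMEMALLOC)) {
		if ((gfp & TOY_GFP_WAIT) &&
		    toy_reclaim(task_flags, gfp | TOY_GFP_NOMEMALLOC))
			goto restart;
	}
	return false;
}
```

In this model, calling toy_alloc(TOY_PF_MEMALLOC, TOY_GFP_WAIT) with an
empty free list succeeds once per available clean page and fails once the
clean pages are gone. The nested allocation inside toy_reclaim() never
reaches toy_reclaim() again, so the nesting depth stays at one.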

This means that allocations under PF_MEMALLOC will no longer fail as soon
as the reserves are exhausted. They will do a limited form of reclaim
instead.

The reclaim is of particular importance to stacked filesystems that may
do a lot of allocations in the write path. Reclaim keeps working
as long as there are clean file-backed pages to reclaim.

Signed-off-by: Christoph Lameter <clameter@xxxxxxx>

---
mm/page_alloc.c | 11 +++++++++++
1 file changed, 11 insertions(+)

Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c 2007-08-13 23:50:01.000000000 -0700
+++ linux-2.6/mm/page_alloc.c 2007-08-13 23:58:43.000000000 -0700
@@ -1306,6 +1306,17 @@ nofail_alloc:
zonelist, ALLOC_NO_WATERMARKS);
if (page)
goto got_pg;
+ /*
+ * If we are already in reclaim then the environment
+ * is already set up. We can simply call
+ * try_to_free_pages(). Just make sure that
+ * we do not allocate anything.
+ */
+ if (p->flags & PF_MEMALLOC && wait &&
+ try_to_free_pages(zonelist->zones, order,
+ gfp_mask | __GFP_NOMEMALLOC))
+ goto restart;
+
if (gfp_mask & __GFP_NOFAIL) {
congestion_wait(WRITE, HZ/50);
goto nofail_alloc;
