Re: [PATCH 12/13] mm: Throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage

From: Mel Gorman
Date: Thu Apr 28 2011 - 06:14:54 EST


On Thu, Apr 28, 2011 at 10:22:44AM +1000, NeilBrown wrote:
> On Wed, 27 Apr 2011 17:08:10 +0100 Mel Gorman <mgorman@xxxxxxx> wrote:
>
>
> > +/*
> > + * Throttle direct reclaimers if backing storage is backed by the network
> > + * and the PFMEMALLOC reserve for the preferred node is getting dangerously
> > + * depleted. kswapd will continue to make progress and wake the processes
> > + * when the low watermark is reached
> > + */
> > +static void throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
> > + nodemask_t *nodemask)
> > +{
> > + struct zone *zone;
> > + int high_zoneidx = gfp_zone(gfp_mask);
> > + DEFINE_WAIT(wait);
> > +
> > + /* Check if the pfmemalloc reserves are ok */
> > + first_zones_zonelist(zonelist, high_zoneidx, NULL, &zone);
> > + if (pfmemalloc_watermark_ok(zone->zone_pgdat, high_zoneidx))
> > + return;
>
> As the first thing that 'wait_event_interruptible" does is test the condition
> and return if it is true, this "if () return;" is pointless.
>

In patch 13, we count the number of times a process gets throttled here.
Taken on its own, the check is indeed pointless in this patch, but it
makes sense in the context of the following patch.
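
To illustrate why the early return stays: with the counter from the next
patch folded in, the function would look roughly like the sketch below.
The count_vm_event() call and the event name are how I would expect it to
end up, not the literal patch 13 hunk.

	static void throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
						nodemask_t *nodemask)
	{
		struct zone *zone;
		int high_zoneidx = gfp_zone(gfp_mask);

		/*
		 * Returning early means the throttle counter below only
		 * ticks when a process really stalls on the pfmemalloc
		 * reserves.
		 */
		first_zones_zonelist(zonelist, high_zoneidx, NULL, &zone);
		if (pfmemalloc_watermark_ok(zone->zone_pgdat, high_zoneidx))
			return;

		/* Hypothetical counter name; patch 13 adds the real one */
		count_vm_event(PGSCAN_DIRECT_THROTTLE);

		/* Throttle until kswapd replenishes the reserves */
		wait_event_interruptible(zone->zone_pgdat->pfmemalloc_wait,
			pfmemalloc_watermark_ok(zone->zone_pgdat, high_zoneidx));
	}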

> > +
> > + /* Throttle */
> > + wait_event_interruptible(zone->zone_pgdat->pfmemalloc_wait,
> > + pfmemalloc_watermark_ok(zone->zone_pgdat, high_zoneidx));
> > +}
>
> I was surprised that you chose wait_event_interruptible as your previous code
> was almost exactly "wait_event_killable".
>
> Is there some justification for not throttling processes which happen to have
> a (non-fatal) signal pending?
>

No justification; wait_event_killable() is indeed a better fit.
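That is, the throttle would simply switch to the killable variant of the
same wait. Just a sketch of the obvious change, not the respun patch:

	/*
	 * Throttle until kswapd restores the pfmemalloc reserves. With
	 * wait_event_killable() only a fatal signal breaks out of the
	 * wait; ordinary signals keep the task throttled.
	 */
	wait_event_killable(zone->zone_pgdat->pfmemalloc_wait,
		pfmemalloc_watermark_ok(zone->zone_pgdat, high_zoneidx));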

> > +
> > unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> > gfp_t gfp_mask, nodemask_t *nodemask)
> > {
> > @@ -2133,6 +2172,15 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> > .nodemask = nodemask,
> > };
> >
> > + throttle_direct_reclaim(gfp_mask, zonelist, nodemask);
> > +
> > + /*
> > + * Do not enter reclaim if fatal signal is pending. 1 is returned so
> > + * that the page allocator does not consider triggering OOM
> > + */
> > + if (fatal_signal_pending(current))
> > + return 1;
> > +
> > trace_mm_vmscan_direct_reclaim_begin(order,
> > sc.may_writepage,
> > gfp_mask);
> > @@ -2488,6 +2536,12 @@ loop_again:
> > }
> >
> > }
> > +
> > + /* Wake throttled direct reclaimers if low watermark is met */
> > + if (waitqueue_active(&pgdat->pfmemalloc_wait) &&
> > + pfmemalloc_watermark_ok(pgdat, MAX_NR_ZONES - 1))
> > + wake_up_interruptible(&pgdat->pfmemalloc_wait);
> > +
> > if (all_zones_ok || (order && pgdat_balanced(pgdat, balanced, *classzone_idx)))
> > break; /* kswapd: all done */
> > /*
>

--
Mel Gorman
SUSE Labs