Re: [PATCH 0/3] Reduce watermark-related problems with the per-cpu allocator V2

From: Mel Gorman
Date: Mon Aug 23 2010 - 09:01:45 EST


On Mon, Aug 23, 2010 at 07:45:25AM -0500, Christoph Lameter wrote:
> On Mon, 23 Aug 2010, Mel Gorman wrote:
>
> > Internal IBM test teams beta testing distribution kernels have reported
> > problems on machines with a large number of CPUs whereby page allocator
> > failure messages show huge differences between the nr_free_pages vmstat
> > counter and what is available on the buddy lists. In an extreme example,
> > nr_free_pages was above the min watermark but zero pages were on the buddy
> > lists allowing the system to potentially livelock unable to make forward
> > progress unless an allocation succeeds. There is no reason why the problems
> > would not affect mainline so the following series mitigates the problems
> > in the page allocator related to per-cpu counter drift and lists.
>
> The maximum time for which the livelock can exist is the vm stat
> interval. By default the counters are brought up to date at least once per
> second or if a certain delta was violated. Drifts are controlled by the
> delta configuration.
>

While there is a maximum time (2 seconds I think) for which the drift
can exist, a machine under enough pressure can make a mess of the
watermarks during that time. If that weren't the case, these livelocks
with 0 pages free wouldn't be happening.
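
To put rough numbers on the drift argument, here is a minimal sketch in
plain userspace C. It is not kernel code: the helper names and the figures
in the usage note are assumptions made up for illustration. It assumes each
CPU can hold an unflushed per-cpu delta of up to a fixed threshold, so the
global nr_free_pages may overstate what is on the buddy lists by as much as
nr_cpus * threshold pages.

```c
#include <stdio.h>

/*
 * Hypothetical helper: worst-case number of pages by which the global
 * counter can overstate free memory, assuming every CPU is holding an
 * unflushed delta of per_cpu_threshold pages.
 */
long worst_case_drift(long nr_cpus, long per_cpu_threshold)
{
	return nr_cpus * per_cpu_threshold;
}

/*
 * Hypothetical check: returns 1 when the reported counter is above the
 * min watermark even though, after subtracting the worst-case drift,
 * the buddy lists could in fact be empty. Returns 0 otherwise.
 */
int watermark_check_unreliable(long nr_free_pages, long min_wmark,
			       long nr_cpus, long per_cpu_threshold)
{
	long drift = worst_case_drift(nr_cpus, per_cpu_threshold);

	return nr_free_pages > min_wmark && nr_free_pages - drift <= 0;
}

/* Print the worst-case picture for one set of illustrative inputs. */
void report(long nr_free_pages, long min_wmark,
	    long nr_cpus, long per_cpu_threshold)
{
	printf("reported=%ld min=%ld worst-case drift=%ld unreliable=%d\n",
	       nr_free_pages, min_wmark,
	       worst_case_drift(nr_cpus, per_cpu_threshold),
	       watermark_check_unreliable(nr_free_pages, min_wmark,
					  nr_cpus, per_cpu_threshold));
}
```

With illustrative inputs of 64 CPUs and a per-cpu threshold of 125 pages,
the worst-case drift is 8000 pages, so a reported nr_free_pages of 6000
against a min watermark of 5000 would pass the watermark check while the
buddy lists could be empty. The more CPUs, the wider that window gets.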

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab