Re: [PATCH 3/3] mm: page allocator: Reconsider zones for allocation after direct reclaim

From: Mel Gorman
Date: Thu Jul 14 2011 - 02:10:58 EST


On Thu, Jul 14, 2011 at 12:20:38PM +0900, KOSAKI Motohiro wrote:
> (2011/07/13 20:10), Mel Gorman wrote:
> > On Wed, Jul 13, 2011 at 09:42:39AM +0900, KOSAKI Motohiro wrote:
> >> (2011/07/11 22:01), Mel Gorman wrote:
> >>> With zone_reclaim_mode enabled, it's possible for zones to be considered
> >>> full in the zonelist_cache so they are skipped in the future. If the
> >>> process enters direct reclaim, the ZLC may still consider zones to be
> >>> full even after reclaiming pages. Reconsider all zones for allocation
> >>> if direct reclaim returns successfully.
> >>>
> >>> Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
> >>
> >> Hmmm...
> >>
> >> I like the concept, but I'm a bit worried about a corner case.
> >>
> >> If users are using cpusets/mempolicy, direct reclaim doesn't scan all zones.
> >> In that case, zlc_clear_zones_full() seems too aggressive an operation.
> >
> > As the system is likely to be running slowly if it is in direct
> > reclaim, the complexity of being careful about which zone is cleared
> > is not worth it.
> >
> >> Instead, couldn't we turn zlc->fullzones off from kswapd?
> >>
> >
> > Which zonelist should it clear (there are two) and when should it
> > happen? If it clears it on each cycle around balance_pgdat(), there
> > is no guarantee that it'll be cleared between when direct reclaim
> > finishes and an attempt is made to allocate.
>
> Hmm..
>
> Probably I'm now missing the point of this patch. Why do we need
> to guarantee that the zlc and direct reclaim are tightly coupled?

Because direct reclaim may free enough memory that the zlc's record of
the zone being full is no longer correct.

> IIUC,
> the zlc means "avoid touching the free lists of zones that have no
> free memory". So I thought any point where free pages increase would be
> an acceptable place to clear it. On the other hand, direct reclaim
> finishing does not guarantee that the zones in the zonelist have enough
> free memory, because it has bail-out logic.
>

There is no guarantee, but there is a reasonable expectation that direct
reclaim will free some memory, which means we should reconsider the
zones for allocation.
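
To illustrate the stale-cache case, here is a rough userspace sketch. It
is not the kernel code: the toy_* names, MAX_ZONES, the structure
layouts, and the simple free_pages > watermark check are all made up for
this mail. It only models a per-zone "full" bit that is set when a zone
fails its check and that causes the zone to be skipped afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ZONES 8	/* hypothetical; fits in one 32-bit bitmap */

/* Toy stand-in for the zonelist cache: one "full" bit per zone. */
struct toy_zlc {
	uint32_t fullzones;
};

/* Toy zone: allocation succeeds only while free_pages > watermark. */
struct toy_zone {
	unsigned long free_pages;
	unsigned long watermark;
};

static void toy_zlc_mark_zone_full(struct toy_zlc *zlc, int z)
{
	assert(z >= 0 && z < MAX_ZONES);
	zlc->fullzones |= (uint32_t)1 << z;
}

static bool toy_zlc_zone_worth_trying(const struct toy_zlc *zlc, int z)
{
	assert(z >= 0 && z < MAX_ZONES);
	return !(zlc->fullzones & ((uint32_t)1 << z));
}

/* What the patch does after a successful direct reclaim: forget it all. */
static void toy_zlc_clear_zones_full(struct toy_zlc *zlc)
{
	zlc->fullzones = 0;
}

/* Returns the index of the zone that satisfied the request, or -1. */
static int toy_alloc(struct toy_zlc *zlc, struct toy_zone *zones, int nr)
{
	for (int z = 0; z < nr; z++) {
		if (!toy_zlc_zone_worth_trying(zlc, z))
			continue;	/* skipped without reading free_pages */
		if (zones[z].free_pages > zones[z].watermark) {
			zones[z].free_pages--;
			return z;
		}
		toy_zlc_mark_zone_full(zlc, z);
	}
	return -1;
}
```

In this model, once every zone has been marked full, raising free_pages
on a zone (standing in for reclaim) does not help toy_alloc() until
toy_zlc_clear_zones_full() is called, because the zone is skipped before
its free pages are ever examined.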

> So, I think we don't need to care zonelist, just kswapd turn off
> their own node.
>

I don't understand what you mean by this.

> And, just curious, if we have a proper zlc clear point, why
> do we need to keep the HZ timeout?
>

We still need it because we are not guaranteed to enter direct reclaim
either. Memory could be freed by a process exiting and I'd rather not
add cost to the free path to find and clear all zonelists referencing
the zone the page being freed belongs to.
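
As a rough userspace sketch of why a timeout covers that case (again not
the kernel code: the timed_zlc_* names, TOY_HZ, and passing the tick
count in as a parameter are assumptions made for illustration), the
expiry is checked lazily on the allocation side, so the free path never
has to touch the cache:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_HZ 100UL	/* hypothetical number of ticks per second */

struct timed_zlc {
	uint32_t fullzones;		/* one "full" bit per zone */
	unsigned long last_full_zap;	/* tick at which bits were last cleared */
};

/*
 * Called at the start of an allocation attempt. If more than one
 * second's worth of ticks has passed since the last zap, forget every
 * "full" bit. Unsigned subtraction keeps the comparison correct when
 * the tick counter wraps.
 */
static void timed_zlc_setup(struct timed_zlc *zlc, unsigned long now)
{
	if (now - zlc->last_full_zap > TOY_HZ) {
		zlc->fullzones = 0;
		zlc->last_full_zap = now;
	}
}

static void timed_zlc_mark_zone_full(struct timed_zlc *zlc, int z)
{
	assert(z >= 0 && z < 32);
	zlc->fullzones |= (uint32_t)1 << z;
}

static bool timed_zlc_zone_is_full(const struct timed_zlc *zlc, int z)
{
	assert(z >= 0 && z < 32);
	return zlc->fullzones & ((uint32_t)1 << z);
}
```

However the memory came back (an exiting process, reclaim, anything
else), a stale "full" bit in this model survives for at most about one
second of ticks before the next allocation attempt discards it.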

--
Mel Gorman
SUSE Labs