Re: [PATCH 02/10] mm: vmscan: Obey proportional scanning requirements for kswapd

From: Michal Hocko
Date: Fri Mar 22 2013 - 06:47:46 EST


On Fri 22-03-13 11:04:49, Michal Hocko wrote:
> On Fri 22-03-13 08:37:04, Mel Gorman wrote:
> > On Fri, Mar 22, 2013 at 08:54:27AM +0100, Michal Hocko wrote:
> > > On Thu 21-03-13 15:34:42, Mel Gorman wrote:
> > > > On Thu, Mar 21, 2013 at 04:07:55PM +0100, Michal Hocko wrote:
> > > > > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > > > > index 4835a7a..182ff15 100644
> > > > > > > > --- a/mm/vmscan.c
> > > > > > > > +++ b/mm/vmscan.c
> > > > > > > > @@ -1815,6 +1815,45 @@ out:
> > > > > > > > }
> > > > > > > > }
> > > > > > > >
> > > > > > > > +static void recalculate_scan_count(unsigned long nr_reclaimed,
> > > > > > > > +		unsigned long nr_to_reclaim,
> > > > > > > > +		unsigned long nr[NR_LRU_LISTS])
> > > > > > > > +{
> > > > > > > > +	enum lru_list l;
> > > > > > > > +
> > > > > > > > +	/*
> > > > > > > > +	 * For direct reclaim, reclaim the number of pages requested. Less
> > > > > > > > +	 * care is taken to ensure that scanning for each LRU is properly
> > > > > > > > +	 * proportional. This is unfortunate and is improper aging but
> > > > > > > > +	 * minimises the amount of time a process is stalled.
> > > > > > > > +	 */
> > > > > > > > +	if (!current_is_kswapd()) {
> > > > > > > > +		if (nr_reclaimed >= nr_to_reclaim) {
> > > > > > > > +			for_each_evictable_lru(l)
> > > > > > > > +				nr[l] = 0;
> > > > > > > > +		}
> > > > > > > > +		return;
> > > > > > >
> > > > > > > > Heh, this is a nicely cryptic way of saying what could be done in
> > > > > > > > shrink_lruvec as
> > > > > > > > 	if (!current_is_kswapd()) {
> > > > > > > > 		if (nr_reclaimed >= nr_to_reclaim)
> > > > > > > > 			break;
> > > > > > > > 	}
> > > > > > >
> > > > > >
> > > > > > Pretty much. At one point during development, this function was more
> > > > > > complex and it evolved into this without me rechecking if splitting it
> > > > > > out still made sense.
> > > > > >
> > > > > > > > Besides that, this is not memcg aware, which I think would break
> > > > > > > > targeted reclaim. That is a kind of direct reclaim, but it would still
> > > > > > > > be good to stay proportional there because it starts with DEF_PRIORITY.
> > > > > > >
> > > > > >
> > > > > > This does break memcg because it's a special sort of direct reclaim.
> > > > > >
> > > > > > > I would suggest moving this back to shrink_lruvec and update the test as
> > > > > > > follows:
> > > > > >
> > > > > > I also noticed that we check whether the scan counts need to be
> > > > > > normalised more than once
> > > > >
> > > > > I didn't mind this because it "disqualified" at least one LRU every
> > > > > round which sounds reasonable to me because all LRUs would be scanned
> > > > > proportionally.
> > > >
> > > > Once the scan count for one LRU is 0 then min will always be 0 and no
> > > > further adjustment is made. It's just redundant to check again.
> > >
> > > Hmm, I was almost sure I wrote that min should be adjusted only if it is >0
> > > in the first loop but it is not there...
> > >
> > > So for real this time.
> > > 	for_each_evictable_lru(l)
> > > 		if (nr[l] && nr[l] < min)
> > > 			min = nr[l];
> > >
> > > This should work, no? Every time you shrink all LRUs and you have
> > > reclaimed enough already, you take the smallest LRU out of the game. This
> > > should keep the proportions even.
> >
> > Let's say we started like this
> >
> > LRU_INACTIVE_ANON 60
> > LRU_ACTIVE_FILE 1000
> > LRU_INACTIVE_FILE 3000
> >
> > and we've reclaimed nr_to_reclaim pages then we recalculate the number
> > of pages to scan from each list as;
> >
> > LRU_INACTIVE_ANON 0
> > LRU_ACTIVE_FILE 940
> > LRU_INACTIVE_FILE 2940
> >
> > We then shrink SWAP_CLUSTER_MAX from each LRU giving us this.
> >
> > LRU_INACTIVE_ANON 0
> > LRU_ACTIVE_FILE 908
> > LRU_INACTIVE_FILE 2908
> >
> > Then under your suggestion this would be recalculated as
> >
> > LRU_INACTIVE_ANON 0
> > LRU_ACTIVE_FILE 0
> > LRU_INACTIVE_FILE 2000
> >
> > Another SWAP_CLUSTER_MAX is reclaimed and then we stop reclaiming. I
> > might still be missing the point of your suggestion but I do not think it
> > would preserve the proportion of pages we reclaim from the anon or file LRUs.
>
> It wouldn't preserve proportion precisely because each reclaim round is
> in SWAP_CLUSTER_MAX units but it would reclaim bigger lists more than
> smaller ones which I thought was the whole point. So yes, using the word
> "proportionally" is unfortunate but I couldn't find a better one.
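
To double check the numbers in Mel's example above, here is a small
user-space sketch in C. It is not kernel code: the demo_* names, the
three-entry array and DEMO_CLUSTER (standing in for SWAP_CLUSTER_MAX,
assumed to be 32 as the 940 -> 908 step implies) are made up for
illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not the kernel definitions. */
#define NR_DEMO_LRUS 3
#define DEMO_CLUSTER 32UL	/* plays the role of SWAP_CLUSTER_MAX */

/* Smallest non-zero entry in nr[], or 0 if every entry is zero. */
static unsigned long demo_min_nonzero(const unsigned long nr[NR_DEMO_LRUS])
{
	unsigned long min = 0;
	size_t l;

	assert(nr != NULL);
	for (l = 0; l < NR_DEMO_LRUS; l++)
		if (nr[l] && (!min || nr[l] < min))
			min = nr[l];
	return min;
}

/* Subtract the smallest non-zero count from every list that has work left. */
static void demo_normalise(unsigned long nr[NR_DEMO_LRUS])
{
	unsigned long min = demo_min_nonzero(nr);
	size_t l;

	for (l = 0; l < NR_DEMO_LRUS; l++)
		if (nr[l])
			nr[l] -= min;
}

/* Scan one batch (at most DEMO_CLUSTER pages) from every list. */
static void demo_scan_round(unsigned long nr[NR_DEMO_LRUS])
{
	size_t l;

	for (l = 0; l < NR_DEMO_LRUS; l++)
		nr[l] -= nr[l] < DEMO_CLUSTER ? nr[l] : DEMO_CLUSTER;
}
```

Starting from {60, 1000, 3000}, demo_normalise() gives {0, 940, 2940},
one demo_scan_round() gives {0, 908, 2908}, and a second
demo_normalise() gives {0, 0, 2000}, which matches the tables above.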

OK, I have obviously missed that you are not breaking out of the loop if
scan_adjusted. Now that I am looking at the updated patch again you just
do
	if (nr_reclaimed < nr_to_reclaim || scan_adjusted)
		continue;

So I thought you would just do one round of reclaim after
nr_reclaimed >= nr_to_reclaim which didn't feel right to me.
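
To spell out what I had missed, here is a rough user-space sketch of a
loop built around that test. It is not the code from the patch: the
sim_* names are hypothetical, every scanned page is assumed to be
reclaimed, and the one-off adjustment (subtracting the smallest
remaining count) is only a stand-in for whatever the real
recalculation does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not the kernel definitions. */
#define NR_SIM_LRUS 3
#define SIM_CLUSTER 32UL	/* plays the role of SWAP_CLUSTER_MAX */

static bool sim_work_left(const unsigned long nr[NR_SIM_LRUS])
{
	size_t l;

	for (l = 0; l < NR_SIM_LRUS; l++)
		if (nr[l])
			return true;
	return false;
}

/*
 * Drain nr[] in SIM_CLUSTER batches. Once nr_to_reclaim is met the
 * counts are adjusted exactly once, and the loop then keeps running
 * until every count is zero. Returns the number of rounds that ran
 * after the adjustment.
 */
static unsigned long sim_shrink(unsigned long nr[NR_SIM_LRUS],
				unsigned long nr_to_reclaim)
{
	unsigned long nr_reclaimed = 0, rounds_after = 0, min;
	bool scan_adjusted = false;
	size_t l;

	assert(nr != NULL);
	while (sim_work_left(nr)) {
		for (l = 0; l < NR_SIM_LRUS; l++) {
			unsigned long n = nr[l] < SIM_CLUSTER ? nr[l] : SIM_CLUSTER;

			nr[l] -= n;
			nr_reclaimed += n;	/* assumption: scanned == reclaimed */
		}
		if (scan_adjusted)
			rounds_after++;

		/* Same shape as the test quoted above: no break, just continue. */
		if (nr_reclaimed < nr_to_reclaim || scan_adjusted)
			continue;

		/* Stand-in adjustment: subtract the smallest remaining count. */
		min = nr[0];
		for (l = 1; l < NR_SIM_LRUS; l++)
			if (nr[l] < min)
				min = nr[l];
		for (l = 0; l < NR_SIM_LRUS; l++)
			nr[l] -= min;
		scan_adjusted = true;
	}
	return rounds_after;
}
```

With nr = {60, 1000, 3000} and nr_to_reclaim = 32 the sketch keeps
going for 92 more rounds after the adjustment, so the bigger lists
still get scanned rather than getting just one more SWAP_CLUSTER_MAX.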

Sorry about the confusion!
--
Michal Hocko
SUSE Labs