Re: [PATCH 1/2] lumpy reclaim: clean up and write lumpy reclaim

From: KOSAKI Motohiro
Date: Wed Jun 10 2009 - 02:33:21 EST


> On Wed, 10 Jun 2009 15:11:21 +0900 (JST)
> KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>
> > > I think lumpy reclaim should be updated to match the current split-LRU.
> > > This patch includes a bugfix and cleanup. What do you think?
> > >
> > > ==
> > > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> > >
> > > In lumpy reclaim, "cursor_page" is found just by pfn, so we don't know
> > > which list the "cursor" page came from. Putting it back on the "src" list is a BUG.
> > > And as pointed out, current lumpy reclaim doesn't seem to
> > > work as originally designed and is a bit complicated. This patch adds a
> > > function try_lumpy_reclaim() and rewrites the logic.
> > >
> > > The major changes from current lumpy reclaim are
> > > - check migratetype before aggressive retry at failure.
> > > - check PG_unevictable at failure.
> > > - scan is done in buddy system order. This helps create
> > > a lump around the targeted page. We'll create contiguous pages for the buddy
> > > allocator as far as we can _around_ the reclaim target page.
> > >
> > > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> > > ---
> > > mm/vmscan.c | 120 +++++++++++++++++++++++++++++++++++-------------------------
> > > 1 file changed, 71 insertions(+), 49 deletions(-)
> > >
> > > Index: mmotm-2.6.30-Jun10/mm/vmscan.c
> > > ===================================================================
> > > --- mmotm-2.6.30-Jun10.orig/mm/vmscan.c
> > > +++ mmotm-2.6.30-Jun10/mm/vmscan.c
> > > @@ -850,6 +850,69 @@ int __isolate_lru_page(struct page *page
> > > return ret;
> > > }
> > >
> > > +static int
> > > +try_lumpy_reclaim(struct page *page, struct list_head *dst, int request_order)
> > > +{
> > > + unsigned long buddy_base, buddy_idx, buddy_start_pfn, buddy_end_pfn;
> > > + unsigned long pfn, page_pfn, page_idx;
> > > + int zone_id, order, type;
> > > + int do_aggressive = 0;
> > > + int nr = 0;
> > > + /*
> > > + * Lumpy reclaim. Try to take nearby pages in requested order to
> > > + * create free contiguous pages. This algorithm tries to start
> > > + * from order 0 and scan buddy pages up to request_order.
> > > + * If you are unsure about buddy position calculation, please see
> > > + * mm/page_alloc.c
> > > + */
> > > + zone_id = page_zone_id(page);
> > > + page_pfn = page_to_pfn(page);
> > > + buddy_base = page_pfn & ~((1 << MAX_ORDER) - 1);
> > > +
> > > + /* Can we expect successful reclaim? */
> > > + type = get_pageblock_migratetype(page);
> > > + if ((type == MIGRATE_MOVABLE) || (type == MIGRATE_RECLAIMABLE))
> > > + do_aggressive = 1;
> > > +
> > > + for (order = 0; order < request_order; ++order) {
> > > + /* offset in this buddy region */
> > > + page_idx = page_pfn & ~buddy_base;
> > > + /* offset of buddy can be calculated by xor */
> > > + buddy_idx = page_idx ^ (1 << order);
> > > + buddy_start_pfn = buddy_base + buddy_idx;
> > > + buddy_end_pfn = buddy_start_pfn + (1 << order);
> > > +
> > > + /* scan range [buddy_start_pfn...buddy_end_pfn) */
> > > + for (pfn = buddy_start_pfn; pfn < buddy_end_pfn; ++pfn) {
> > > + /* Avoid holes within the zone. */
> > > + if (unlikely(!pfn_valid_within(pfn)))
> > > + break;
> > > + page = pfn_to_page(pfn);
> > > + /*
> > > + * Check that we have not crossed a zone boundary.
> > > + * Some arch have zones not aligned to MAX_ORDER.
> > > + */
> > > + if (unlikely(page_zone_id(page) != zone_id))
> > > + break;
> > > +
> > > + /* we are always under ISOLATE_BOTH */
> > > + if (__isolate_lru_page(page, ISOLATE_BOTH, 0) == 0) {
> > > + list_move(&page->lru, dst);
> > > + nr++;
> > > + } else if (do_aggressive && !PageUnevictable(page))
> >
> > Could you explain the intention of this branch in more detail?
> >
> __isolate_lru_page() can fail in the following cases
> - the page is not on LRU.
> This implies
> (a) the page is not for anon/file-cache
> (b) the page is taken off from LRU by shrink_list or pagevec.
> (c) the page is free.
> - the page is temporarily busy.
>
> So, aborting this loop here directly is not very good. But if the page is for
> kernel usage or unevictable, continuing this loop just wastes time.
>
> Then, I used the migrate_type attribute of the target page.
> migrate_type is determined per pageblock_order (which itself is determined by
> the size of hugepage etc.; see include/linux/pageblock-flags.h)
>
> If the page is under MIGRATE_MOVABLE
> - at least 50% of nearby pages are used for GFP_MOVABLE(GFP_HIGHUSER_MOVABLE)
> If the page is under MIGRATE_RECLAIMABLE
> - at least 50% of nearby pages are used for GFP_TEMPORARY
>
> Then, we can expect meaningful lumpy reclaim if do_aggressive == 1.
> If do_aggressive==0, nearby pages are used for some kernel usage and not suitable
> for _this_ lumpy reclaim.
>
> How about a comment like this ?
> /*
> * __isolate_lru_page() returns busy status for many reasons. If we are under
> * migrate type of MIGRATE_MOVABLE/MIGRATE_RECLAIMABLE, we can expect nearby
> * pages are just temporarily busy and should be reclaimed later. (If the page
> * is _now_ free or being freed, __isolate_lru_page() returns -EBUSY.)
> * Then, continue this loop.
> */

OK, looks good.
thanks.

