Re: [PATCH] mm/readahead.c: update the LRU positions of in-core pages, too

From: Wu Fengguang
Date: Tue Jan 26 2010 - 08:02:32 EST


On Mon, Jan 25, 2010 at 03:36:35PM -0700, Chris Frost wrote:
> I changed Wu's patch to add a PageLRU() guard that I believe is required

Good catch, thanks!

> and optimized zone lock acquisition to unlock and relock only at zone
> changes. This optimization seems to provide a 10-20% system time
> improvement for some of my GIMP benchmarks and no improvement for others.

OK.

> I agree that the code that removes and re-adds the lru list entry
> looks correct.
> putback_lru_page() has to worry about a page's evictable status
> changing, but I think this code does not because it holds the page
> zone lock.
>
> Wu removed the ClearPageReadahead(page) call on in-core pages that
> Kamezawa's change added. Not making this call looks OK to me.
>
> Thanks Wu and Kamezawa.
>
>
> What's next?

I happen to be preparing a readahead series and will include this one :)

Thanks,
Fengguang

> ---
> readahead: retain inactive lru pages to be accessed soon
> From: Chris Frost <frost@xxxxxxxxxxx>
>
> Ensure that cached pages in the inactive list are not prematurely evicted;
> move such pages to lru head when they are covered by
> - in-kernel heuristic readahead
> - a posix_fadvise(POSIX_FADV_WILLNEED) hint from an application
>
> Before this patch, pages already in core could be evicted before pages
> that were covered by the same prefetch scan but were not yet in core.
> This behavior could force many small read requests to the disk.
>
> In particular, posix_fadvise(... POSIX_FADV_WILLNEED) on an in-core page
> has no effect on the page's location in the LRU list, even if it is the
> next victim on the inactive list.
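>
> For reference, the application-side hint is a single posix_fadvise()
> call. A minimal sketch, assuming a hypothetical file "datafile":
>
>	#include <fcntl.h>
>	#include <stdio.h>
>	#include <string.h>
>	#include <sys/stat.h>
>	#include <unistd.h>
>
>	int main(void)
>	{
>		struct stat st;
>		int fd, err;
>
>		fd = open("datafile", O_RDONLY);
>		if (fd < 0 || fstat(fd, &st) < 0) {
>			perror("open/fstat");
>			return 1;
>		}
>		/* Hint that the whole file will be needed soon; with
>		 * this patch the hint also moves already-cached pages
>		 * to the lru head. */
>		err = posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED);
>		if (err)
>			fprintf(stderr, "posix_fadvise: %s\n", strerror(err));
>		close(fd);
>		return 0;
>	}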
>
> This change helps address the performance problems we encountered
> while modifying SQLite and the GIMP to use large file prefetching.
> Overall these prefetching techniques improved the runtime of large
> benchmarks by 10-17x for these applications. More details are in the publication
> _Reducing Seek Overhead with Application-Directed Prefetching_ in
> USENIX ATC 2009 and at http://libprefetch.cs.ucla.edu/.
>
> Signed-off-by: Chris Frost <frost@xxxxxxxxxxx>
> Signed-off-by: Steve VanDeBogart <vandebo@xxxxxxxxxxx>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
> ---
> readahead.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 47 insertions(+)
>
> diff --git a/mm/readahead.c b/mm/readahead.c
> index aa1aa23..c1d67ab 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -9,7 +9,9 @@
>
> #include <linux/kernel.h>
> #include <linux/fs.h>
> +#include <linux/memcontrol.h>
> #include <linux/mm.h>
> +#include <linux/mm_inline.h>
> #include <linux/module.h>
> #include <linux/blkdev.h>
> #include <linux/backing-dev.h>
> @@ -133,6 +135,43 @@ out:
> }
>
> /*
> + * The file range is expected to be accessed in the near future. Move
> + * pages (possibly at the inactive lru tail) to the lru head, so that
> + * they are retained in memory for some reasonable time.
> + */
> +static void retain_inactive_pages(struct address_space *mapping,
> + pgoff_t index, int len)
> +{
> + int i;
> + struct page *page;
> + struct zone *zone;
> + struct zone *locked_zone = NULL;
> +
> + for (i = 0; i < len; i++) {
> + page = find_get_page(mapping, index + i);
> + if (!page)
> + continue;
> + zone = page_zone(page);
> + if (zone != locked_zone) {
> + if (locked_zone)
> + spin_unlock_irq(&locked_zone->lru_lock);
> + locked_zone = zone;
> + spin_lock_irq(&locked_zone->lru_lock);
> + }
> + if (!PageActive(page) && !PageUnevictable(page) &&
> + PageLRU(page)) {
> + int lru = page_lru_base_type(page);
> +
> + del_page_from_lru_list(zone, page, lru);
> + add_page_to_lru_list(zone, page, lru);
> + }
> + put_page(page);
> + }
> + if (locked_zone)
> + spin_unlock_irq(&locked_zone->lru_lock);
> +}
> +
> +/*
> * __do_page_cache_readahead() actually reads a chunk of disk. It allocates all
> * the pages first, then submits them all for I/O. This avoids the very bad
> * behaviour which would occur if page allocations are causing VM writeback.
> @@ -184,6 +223,14 @@ __do_page_cache_readahead(struct address_space *mapping, struct file *filp,
> }
>
> /*
> + * Normally readahead will auto stop on cached segments, so we won't
> + * hit many cached pages. If it does happen, bring the inactive pages
> + * adjacent to the newly prefetched ones (if any).
> + */
> + if (ret < nr_to_read)
> + retain_inactive_pages(mapping, offset, page_idx);
> +
> + /*
> * Now start the IO. We ignore I/O errors - if the page is not
> * uptodate then the caller will launch readpage again, and
> * will then handle the error.
>
> --
> Chris Frost
> http://www.frostnet.net/chris/