Re: [PATCH]vmscan: add block plug for page reclaim

From: Shaohua Li
Date: Fri Jul 29 2011 - 06:30:39 EST


On Fri, Jul 29, 2011 at 04:38:47PM +0800, Minchan Kim wrote:
> On Wed, Jul 27, 2011 at 04:45:23PM -0700, Andrew Morton wrote:
> > On Sat, 23 Jul 2011 20:49:10 +0200
> > Jens Axboe <jaxboe@xxxxxxxxxxxx> wrote:
> >
> > > > I can observe the average request size changes. Before the patch, the
> > > > average request size is about 90k from iostat (but the variation is
> > > > big). With the patch, the request size is about 100k and variation is
> > > > small.
> > >
> > > That's a good win right there, imho.
> >
> > yup. Reduced CPU consumption on that path isn't terribly exciting IMO,
> > but improved request size is significant.
>
> Fair enough.
> He didn't write it down in the description.
> At least, the description should include the request size and variation
> instead of the CPU consumption thing.
>
> Shaohua, please rewrite the description, although it's annoying.
That's fine. I've added more description here.



Per-task block plugging can reduce block queue lock contention and increase
request merging. Currently page reclaim doesn't support it. I originally
thought page reclaim didn't need it, because the number of kswapd threads is
limited and file cache writes are mostly done by the flusher.

When I test a workload with heavy swapping on a 4-node machine, each CPU is
doing direct page reclaim and swapping. This causes block queue lock
contention. In my test, without the patch below, the CPU utilization is about
2% ~ 7%. With the patch, the CPU utilization is about 1% ~ 3%. Disk throughput
doesn't change.

From iostat, the average request size increases too. Before the patch, the
average request size is about 90k and the variation is big. With the patch,
the size is about 100k and the variation is small.

This should improve normal kswapd writes and file cache writes too (by
increasing request merging, for example), but the effect might not be as
obvious as in the case explained above.
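
To illustrate why plugging helps on both counts, here is a minimal userspace
sketch. It is only a toy model, not the kernel implementation: every name in
it (struct queue, struct plug, submit_direct, submit_plugged, plug_start,
plug_finish, PLUG_MAX) is hypothetical, and the "lock" is just a counter. It
assumes that a plugged task collects requests on a private list, merges a new
request into the previous one when they are contiguous, and takes the shared
queue lock once when the plug is finished.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are real kernel APIs. */
#define PLUG_MAX 16

struct req {
	unsigned long sector;		/* start sector */
	unsigned long nr_sectors;	/* length in sectors */
};

struct queue {
	unsigned long lock_acquisitions; /* times the shared "lock" was taken */
	unsigned long nr_requests;	 /* requests dispatched to the device */
	unsigned long nr_sectors;	 /* total sectors dispatched */
};

struct plug {
	struct req pending[PLUG_MAX];	/* task-private list, no lock needed */
	size_t count;
};

/* Unplugged path: every request takes the shared queue lock by itself. */
static void submit_direct(struct queue *q, struct req r)
{
	q->lock_acquisitions++;
	q->nr_requests++;
	q->nr_sectors += r.nr_sectors;
}

static void plug_start(struct plug *p)
{
	p->count = 0;
}

/* Take the shared lock once and dispatch everything that is pending. */
static void plug_flush(struct plug *p, struct queue *q)
{
	size_t i;

	if (p->count == 0)
		return;
	q->lock_acquisitions++;
	for (i = 0; i < p->count; i++) {
		q->nr_requests++;
		q->nr_sectors += p->pending[i].nr_sectors;
	}
	p->count = 0;
}

/* Plugged path: back-merge into the last pending request if contiguous. */
static void submit_plugged(struct plug *p, struct queue *q, struct req r)
{
	if (p->count > 0) {
		struct req *last = &p->pending[p->count - 1];

		if (last->sector + last->nr_sectors == r.sector) {
			last->nr_sectors += r.nr_sectors;
			return;
		}
	}
	if (p->count == PLUG_MAX)
		plug_flush(p, q);	/* private list full, flush early */
	assert(p->count < PLUG_MAX);
	p->pending[p->count++] = r;
}

static void plug_finish(struct plug *p, struct queue *q)
{
	plug_flush(p, q);
}
```

In this model, submitting four 8-sector requests at sectors 0, 8, 16 and 100
through submit_direct() takes the lock four times and dispatches four
requests. Submitting the same four between plug_start() and plug_finish()
takes the lock once and dispatches two requests, because the first three are
contiguous and get merged. The total number of sectors is the same either way.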

Signed-off-by: Shaohua Li <shaohua.li@xxxxxxxxx>

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5ed24b9..8ec04b2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1933,12 +1933,14 @@ static void shrink_zone(int priority, struct zone *zone,
 	enum lru_list l;
 	unsigned long nr_reclaimed, nr_scanned;
 	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
+	struct blk_plug plug;

 restart:
 	nr_reclaimed = 0;
 	nr_scanned = sc->nr_scanned;
 	get_scan_count(zone, sc, nr, priority);

+	blk_start_plug(&plug);
 	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
 					nr[LRU_INACTIVE_FILE]) {
 		for_each_evictable_lru(l) {
@@ -1962,6 +1964,7 @@ restart:
 		if (nr_reclaimed >= nr_to_reclaim && priority < DEF_PRIORITY)
 			break;
 	}
+	blk_finish_plug(&plug);
 	sc->nr_reclaimed += nr_reclaimed;

 	/*