Re: [PATCH 07/10] mm: vmscan: Block kswapd if it is encounteringpages under writeback

From: Mel Gorman
Date: Tue Mar 19 2013 - 06:58:38 EST


On Mon, Mar 18, 2013 at 07:58:27PM +0800, Wanpeng Li wrote:
> On Sun, Mar 17, 2013 at 01:04:13PM +0000, Mel Gorman wrote:
> >Historically, kswapd used to congestion_wait() at higher priorities if it
> >was not making forward progress. This made no sense as the failure to make
> >progress could be completely independent of IO. It was later replaced by
> >wait_iff_congested() and removed entirely by commit 258401a6 (mm: don't
> >wait on congested zones in balance_pgdat()) as it was duplicating logic
> >in shrink_inactive_list().
> >
> >This is problematic. If kswapd encounters many pages under writeback and
> >it continues to scan until it reaches the high watermark then it will
> >quickly skip over the pages under writeback and reclaim clean young
> >pages or push applications out to swap.
> >
> >The use of wait_iff_congested() is not suited to kswapd as it will only
> >stall if the underlying BDI is really congested or a direct reclaimer was
> >unable to write to the underlying BDI. kswapd bypasses the BDI congestion
> >as it sets PF_SWAPWRITE but even if this was taken into account then it
> >would cause direct reclaimers to stall on writeback which is not desirable.
> >
> >This patch sets a ZONE_WRITEBACK flag if direct reclaim or kswapd is
> >encountering too many pages under writeback. If this flag is set and
> >kswapd encounters a PageReclaim page under writeback then it'll assume
> >that the LRU lists are being recycled too quickly before IO can complete
> >and block waiting for some IO to complete.
> >
> >Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
> >---
> > include/linux/mmzone.h | 8 ++++++++
> > mm/vmscan.c | 29 ++++++++++++++++++++++++-----
> > 2 files changed, 32 insertions(+), 5 deletions(-)
> >
> >diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >index edd6b98..c758fb7 100644
> >--- a/include/linux/mmzone.h
> >+++ b/include/linux/mmzone.h
> >@@ -498,6 +498,9 @@ typedef enum {
> > ZONE_DIRTY, /* reclaim scanning has recently found
> > * many dirty file pages
> > */
> >+ ZONE_WRITEBACK, /* reclaim scanning has recently found
> >+ * many pages under writeback
> >+ */
> > } zone_flags_t;
> >
> > static inline void zone_set_flag(struct zone *zone, zone_flags_t flag)
> >@@ -525,6 +528,11 @@ static inline int zone_is_reclaim_dirty(const struct zone *zone)
> > return test_bit(ZONE_DIRTY, &zone->flags);
> > }
> >
> >+static inline int zone_is_reclaim_writeback(const struct zone *zone)
> >+{
> >+ return test_bit(ZONE_WRITEBACK, &zone->flags);
> >+}
> >+
> > static inline int zone_is_reclaim_locked(const struct zone *zone)
> > {
> > return test_bit(ZONE_RECLAIM_LOCKED, &zone->flags);
> >diff --git a/mm/vmscan.c b/mm/vmscan.c
> >index 493728b..7d5a932 100644
> >--- a/mm/vmscan.c
> >+++ b/mm/vmscan.c
> >@@ -725,6 +725,19 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >
> > if (PageWriteback(page)) {
> > /*
> >+ * If reclaim is encountering an excessive number of
> >+ * pages under writeback and this page is both under
>
> Is the comment should changed to "encountered an excessive number of
> pages under writeback or this page is both under writeback and PageReclaim"?
> See below:
>

I intended to check for PageReclaim as well but it got lost in a merge
error. Fixed now.

--
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/