[RFC PATCH 0/2] fix unbounded too_many_isolated

From: Michal Hocko
Date: Wed Jan 18 2017 - 08:54:42 EST


Hi,
this is based on top of [1]. The first patch continues in the direction
of moving some decisions to zones rather than nodes. In this case it is
the NR_ISOLATED* counters which I believe need to be zone aware as well.
See patch 1 for the reasoning.

The second patch builds on top of that and tries to address the problem
which has been reported by Tetsuo several times already. In the
current implementation we can loop deep in the reclaim path without
any effective way out to re-evaluate our decisions about the reclaim
retries. Patch 2 says more about that, but in principle we should place
the retry logic as high in the allocator chain as possible, and so we
should get rid of any unbounded retry loops inside the reclaim. This is
what the patch does.
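
To illustrate the principle only, here is a minimal userspace sketch. It is
not the code from patch 2: every name below (zone_sketch,
too_many_isolated_sketch, shrink_once_sketch, alloc_with_retries_sketch,
drain_wait_sketch) is hypothetical, and the "isolated > inactive" check is
an assumed simplification of the real heuristic. The point is that the
reclaim step waits at most once and then reports back, while the only
retry loop lives in the caller and is bounded.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-zone counters; not the kernel's struct zone. */
struct zone_sketch {
	unsigned long nr_isolated;	/* pages currently isolated from the LRU */
	unsigned long nr_inactive;	/* pages still on the inactive LRU */
};

/* Hypothetical hook that models waiting for other reclaimers. */
typedef void (*wait_fn_t)(struct zone_sketch *zone);

enum reclaim_result { RECLAIM_PROGRESS, RECLAIM_THROTTLED };

/* Assumed heuristic: throttle when isolated pages outnumber inactive ones. */
static bool too_many_isolated_sketch(const struct zone_sketch *zone)
{
	return zone->nr_isolated > zone->nr_inactive;
}

/* Example wait hook: pretend other reclaimers put all isolated pages back. */
static void drain_wait_sketch(struct zone_sketch *zone)
{
	zone->nr_isolated = 0;
}

/*
 * One reclaim attempt. Instead of looping until the isolated count
 * drops, wait at most once and then hand the decision to the caller.
 */
static enum reclaim_result shrink_once_sketch(struct zone_sketch *zone,
					      wait_fn_t wait_fn)
{
	if (too_many_isolated_sketch(zone)) {
		if (wait_fn)
			wait_fn(zone);
		if (too_many_isolated_sketch(zone))
			return RECLAIM_THROTTLED;
	}
	return RECLAIM_PROGRESS;
}

/*
 * Allocator-level retry logic: the only loop, and it is bounded, so the
 * caller always gets a chance to re-evaluate (e.g. give up or fall back).
 */
static bool alloc_with_retries_sketch(struct zone_sketch *zone,
				      int max_retries, wait_fn_t wait_fn)
{
	for (int i = 0; i < max_retries; i++) {
		if (shrink_once_sketch(zone, wait_fn) == RECLAIM_PROGRESS)
			return true;
	}
	return false;
}
```

With a zone that never drains, alloc_with_retries_sketch() returns false
after max_retries attempts rather than spinning inside the reclaim step.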

I am sending this as an RFC because I am not yet sure this is the best
way forward. My testing shows that the system behaves sanely.

Thoughts, comments?

[1] http://lkml.kernel.org/r/20170117103702.28542-1-mhocko@xxxxxxxxxx

Michal Hocko (2):
  mm, vmscan: account the number of isolated pages per zone
  mm, vmscan: do not loop on too_many_isolated for ever

 include/linux/mmzone.h |  4 +--
 mm/compaction.c        | 16 ++++-----
 mm/khugepaged.c        |  4 +--
 mm/memory_hotplug.c    |  2 +-
 mm/migrate.c           |  4 +--
 mm/page_alloc.c        | 14 ++++----
 mm/vmscan.c            | 93 ++++++++++++++++++++++++++++++++------------------
 mm/vmstat.c            |  4 +--
 8 files changed, 82 insertions(+), 59 deletions(-)