Re: [RFC v5 0/3] mm: make swapin readahead to gain more thp performance

From: Ebru Akagunduz
Date: Thu Feb 25 2016 - 18:30:32 EST


On Thu, Feb 25, 2016 at 05:35:50PM -0500, Rik van Riel wrote:
> On Wed, 2016-02-24 at 23:36 -0800, Hugh Dickins wrote:
>
> > Doesn't this imply that __collapse_huge_page_swapin() will initiate all
> > the necessary swapins for a THP, then (given the FAULT_FLAG_ALLOW_RETRY)
> > not wait for them to complete, so khugepaged will give up on that extent
> > and move on to another; then after another full circuit of all the mms
> > it needs to examine, it will arrive back at this extent and build a THP
> > from the swapins it arranged last time.
> >
> > Which may work well when a system transitions from busy+swappingout
> > to idle+swappingin, but isn't that rather a special case?  It feels
> > (meaning, I've not measured at all) as if the inbetween busyish case
> > will waste a lot of I/O and memory on swapins that have to be discarded
> > again before khugepaged has made its sedate way back to slotting them in.
>
>
> There may be a fairly simple way to prevent
> that from becoming an issue.
>
> When khugepaged wakes up, it can check the
> PGSWPOUT or even the PGSTEAL_* stats for
> the system, and skip swapin readahead if
> there was swapout activity (or any page
> reclaim activity?) since the time it last
> ran.
>
> That way the swapin readahead will do
> its thing when transitioning from
> busy + swapout to idle + swapin, but not
> while the system is under permanent memory
> pressure.
>
The idea makes sense to me.
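
A rough sketch of how I read the suggestion, written as plain userspace C
rather than kernel code. The struct and function names here are made up
for illustration, and the counters are passed in by the caller instead of
being read from the real vm event statistics:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the counters discussed above. */
struct reclaim_snapshot {
	unsigned long pgswpout;	/* pages swapped out so far */
	unsigned long pgsteal;	/* pages reclaimed so far (sum of PGSTEAL_*) */
};

/*
 * Decide whether this pass may do swapin readahead.
 *
 * Returns true only if no swapout (and, when check_steal is set, no
 * reclaim) happened since the previous pass. The stored snapshot is
 * always refreshed, so the next pass compares against this one.
 */
static bool swapin_allowed(struct reclaim_snapshot *last,
			   const struct reclaim_snapshot *now,
			   bool check_steal)
{
	bool quiet;

	assert(last && now);

	quiet = (now->pgswpout == last->pgswpout);
	if (check_steal)
		quiet = quiet && (now->pgsteal == last->pgsteal);

	*last = *now;
	return quiet;
}
```

With check_steal off this only reacts to swapout; turning it on gives the
stricter "any page reclaim activity" variant. Which of the two is the
right threshold is exactly the open question below.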
> Am I forgetting anything obvious?
>
> Is this too aggressive?
>
> Not aggressive enough?
>
> Could PGPGOUT + PGSWPOUT be a useful
> in-between between just PGSWPOUT or
> PGSTEAL_*?
>
> -- 
> All rights reversed