Re: [PATCH v9 07/14] mm: multi-gen LRU: exploit locality in rmap

From: Yu Zhao
Date: Wed Apr 06 2022 - 23:05:25 EST


On Wed, Apr 6, 2022 at 8:29 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> On Wed, Mar 9, 2022 at 3:48 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> >
> > Searching the rmap for PTEs mapping each page on an LRU list (to test
> > and clear the accessed bit) can be expensive because pages from
> > different VMAs (PA space) are not cache friendly to the rmap (VA
> > space). For workloads mostly using mapped pages, the rmap has a high
> > CPU cost in the reclaim path.
> >
> > This patch exploits spatial locality to reduce the trips into the
> > rmap. When shrink_page_list() walks the rmap and finds a young PTE, a
> > new function lru_gen_look_around() scans at most BITS_PER_LONG-1
> > adjacent PTEs. On finding another young PTE, it clears the accessed
> > bit and updates the gen counter of the page mapped by this PTE to
> > (max_seq%MAX_NR_GENS)+1.
>
> Hi Yu,
> This seems like an interesting feature for saving the cost of the rmap,
> but could it cause cold pages to be judged as hot?
> Suppose a page is mapped by 20 processes and has been accessed by 5 of
> them. When we look around from one of those 5 processes, the page is
> young and that PTE is cleared, but the other 4 young PTEs remain set.
> If the page is then not accessed for a long time, won't those 4
> uncleared PTEs still make it look "hot", either when the look-around
> reaches the other 4 processes or when the rmap walks the page later?

Why would the remaining 4 accessed PTEs be skipped? The rmap should
check all 20 PTEs.
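
To make the mechanism concrete, here is a minimal userspace sketch of
the look-around idea. This is not the kernel's lru_gen_look_around();
the names (look_around(), NR_PTES, struct pte, page_gen[]) and the
simplified PTE layout are illustrative assumptions:

#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)
#define MAX_NR_GENS   4                 /* matches the patch's MAX_NR_GENS */
#define NR_PTES       512               /* PTEs per page table; illustrative */

struct pte { bool young; int page; };   /* toy PTE: accessed bit + mapped page */

static int page_gen[NR_PTES];           /* per-page gen counter */

/*
 * The rmap found ptes[idx] young; scan at most BITS_PER_LONG-1 PTEs
 * around it. For each additional young PTE, clear its accessed bit and
 * promote the mapped page's gen counter to (max_seq % MAX_NR_GENS) + 1.
 */
static void look_around(struct pte *ptes, unsigned long idx,
			unsigned long max_seq)
{
	unsigned long half = (BITS_PER_LONG - 1) / 2;
	unsigned long start = idx > half ? idx - half : 0;
	unsigned long end = start + BITS_PER_LONG - 1;
	unsigned long i;

	if (end > NR_PTES)
		end = NR_PTES;

	for (i = start; i < end; i++) {
		if (i == idx || !ptes[i].young)
			continue;
		ptes[i].young = false;  /* clear the accessed bit */
		page_gen[ptes[i].page] = (max_seq % MAX_NR_GENS) + 1;
	}
}

int main(void)
{
	struct pte ptes[NR_PTES] = { 0 };

	ptes[100] = (struct pte){ .young = true, .page = 100 };
	ptes[103] = (struct pte){ .young = true, .page = 103 };

	look_around(ptes, 100, 7);      /* rmap hit on ptes[100]; max_seq = 7 */
	printf("page 103: gen=%d young=%d\n", page_gen[103], ptes[103].young);
	return 0;
}

The point of the sketch is the cost bound: each rmap hit triggers at
most BITS_PER_LONG-1 extra PTE checks, so the look-around amortizes the
rmap walk rather than replacing it.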

Even if they were skipped, it wouldn't matter. The same argument could
be made for the remaining million minus one pages that have been timely
scanned on a 4GB laptop (4GB of 4KB pages is roughly one million
pages). The fundamental principle (assumption) of MGLRU has never been
to make the best choices. Nothing can, because it's impossible to
predict the future that well given the complexity of today's workloads:
not on a phone, and definitely not on a server running mixed types of
workloads. The primary goal is to avoid the worst choices at a minimum
(scanning) cost. The secondary goal is to pick good ones, which are
probably about half of all possible choices, at an acceptable cost.