Re: [RFC 0/3] mm: Discard lazily freed pages when migrating

From: Michal Hocko
Date: Fri Feb 28 2020 - 05:15:21 EST


On Fri 28-02-20 10:50:50, Michal Hocko wrote:
> On Fri 28-02-20 16:55:40, Huang, Ying wrote:
> > David Hildenbrand <david@xxxxxxxxxx> writes:
> [...]
> > > E.g., free page reporting in QEMU wants to use MADV_FREE. The guest will
> > > report currently free pages to the hypervisor, which will MADV_FREE the
> > > reported memory. As long as there is no memory pressure, there is no
> > > need to actually free the pages. Once the guest reuses such a page, it
> > > could happen that there is still the old page and pulling in in a fresh
> > > (zeroed) page can be avoided.
> > >
> > > AFAIKs, after your change, we would get more pages discarded from our
> > > guest, resulting in more fresh (zeroed) pages having to be pulled in
> > > when a guest touches a reported free page again. But OTOH, page
> > > migration is speed up (avoiding to migrate these pages).
> >
> > Let's look at this problem in another perspective. To migrate the
> > MADV_FREE pages of the QEMU process from the node A to the node B, we
> > need to free the original pages in the node A, and (maybe) allocate the
> > same number of pages in the node B. So the question becomes
> >
> > - we may need to allocate some pages in the node B
> > - these pages may be accessed by the application or not
> > - we should allocate all these pages in advance or allocate them lazily
> > when they are accessed.
> >
> > We thought the common philosophy in Linux kernel is to allocate lazily.
>
> The common philosophy is to cache as much as possible. And MADV_FREE
> pages are a kind of cache as well. If the target node is short on memory
> then those will be reclaimed as a cache so a pro-active freeing sounds
> counter productive as you do not have any idea whether that cache is
> going to be used in future. In other words you are not going to free a
> clean page cache if you want to use that memory as a migration target

sorry I meant to say that you are not going to clean page cache when
migrating it.

> right? So you should make a clear case about why MADV_FREE cache is less
> important than the clean page cache and ideally have a good
> justification backed by real workloads.

--
Michal Hocko
SUSE Labs