Re: [PATCH v3 1/5] mm: introduce MADV_COLD

From: Dave Hansen
Date: Thu Jun 27 2019 - 10:36:54 EST


On 6/27/19 7:02 AM, Michal Hocko wrote:
>> Is the LRU behavior part of the interface or the implementation?
>>
>> I ask because we've got something in between tossing something down the
>> LRU and swapping it: page migration. Specifically, on a system with
>> slower memory media (like persistent memory) we just migrate a page
>> instead of discarding it at reclaim:
> But we already do have interfaces for migrating the memory
> (move_pages(2)). Why should this interface duplicate that interface?
> I believe the only purpose of these two new madvise modes is to provide
> non-destructive MADV_{DONTNEED,FREE} alternatives. In other words,
> a pageout vs. age interface.

The problem with the existing interface for this case is that the caller
has to know exactly where the memory is and where it should go. For
instance, if you have two sockets, you very likely want to demote DRAM
to the persistent memory DIMM sitting next to it and not go
cross-socket. To do _that_, you need to know where the existing
allocation lies so you can find the appropriate destination node.

That's not a problem for existing NUMA-enlightened apps, but it is for
everything else.
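
To make that concrete, here is a rough sketch of what a caller has to do
by hand with move_pages(2) to demote a single page. It goes through
syscall(2) directly so it does not depend on libnuma. pmem_node_near()
is a hypothetical stand-in for the topology lookup (it is not an
existing API and here it just returns the node it was given), and the
MPOL_MF_MOVE fallback define is only there in case the header is missing:

```c
/* Sketch only: the two steps a move_pages(2) caller must do itself. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MPOL_MF_MOVE
#define MPOL_MF_MOVE (1 << 1)	/* value from the uapi mempolicy.h */
#endif

/*
 * Hypothetical policy hook: map a DRAM node to the persistent memory
 * node sitting next to it. A real version would have to learn the
 * topology from somewhere (e.g. sysfs); this stub returns the same node.
 */
static int pmem_node_near(int dram_node)
{
	return dram_node;
}

/* Step 1: ask where the page is now (nodes == NULL means query only). */
static int page_current_node(void *addr)
{
	void *pages[1] = { addr };
	int status[1] = { -ENOENT };

	assert(addr != NULL);
	if (syscall(SYS_move_pages, 0, 1UL, pages, NULL, status, 0) < 0)
		return -errno;
	return status[0];	/* node id, or a negative errno for this page */
}

/* Step 2: pick the destination ourselves and request the move. */
static int demote_page(void *addr)
{
	void *pages[1] = { addr };
	int nodes[1];
	int status[1] = { -ENOENT };
	int cur = page_current_node(addr);

	if (cur < 0)
		return cur;
	nodes[0] = pmem_node_near(cur);
	if (syscall(SYS_move_pages, 0, 1UL, pages, nodes, status,
		    MPOL_MF_MOVE) < 0)
		return -errno;
	return status[0];
}
```

Both the query and the node selection are the caller's job here, which
is exactly the part a non-NUMA-aware app has no reason to get right.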

For MADV_COLD, if we defined it like this, I think we could use it for
both purposes (demotion and LRU movement):

    Pages in the specified regions will be treated as less-recently-
    accessed compared to pages in the system with similar access
    frequencies. In contrast to MADV_DONTNEED, the contents of the
    region are preserved.

It would be nice not to talk about reclaim at all since we're not
promising reclaim per se.
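
From userspace, the only thing that wording lets a caller rely on is
that the data survives the hint. A minimal sketch of that contract
follows. The function name is made up for illustration, and the
fallback value 20 for MADV_COLD is an assumption for headers that do not
define it (it matches the asm-generic value in kernels that have the
flag):

```c
/* Sketch only: MADV_COLD is a hint, and the contents must be preserved. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLD
#define MADV_COLD 20	/* assumption: asm-generic value, see above */
#endif

/*
 * Fill [buf, buf + len) with a pattern, hint that the range is cold,
 * then verify the pattern is still there. buf must be page aligned.
 *
 * Returns 0 if the kernel accepted the hint, or -errno if it did not
 * (e.g. -EINVAL on a kernel that does not know MADV_COLD). Either way
 * the contents must be intact, so that part is an assertion.
 */
static int cold_hint_keeps_contents(unsigned char *buf, size_t len)
{
	int ret = 0;
	size_t i;

	memset(buf, 0xAB, len);
	if (madvise(buf, len, MADV_COLD) < 0)
		ret = -errno;

	/* Nothing here depends on whether anything was reclaimed or moved. */
	for (i = 0; i < len; i++)
		assert(buf[i] == 0xAB);
	return ret;
}
```

Note that the readback touches the pages again, so it undoes the hint. A
real caller would just issue the madvise() and move on. The point is
only that nothing in the check depends on reclaim having happened.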