Re: [RFC PATCH 0/10] Another Approach to Use PMEM as NUMA Node

From: Michal Hocko
Date: Thu Mar 28 2019 - 15:12:11 EST


On Thu 28-03-19 11:58:57, Yang Shi wrote:
>
>
> On 3/27/19 11:58 PM, Michal Hocko wrote:
> > On Wed 27-03-19 19:09:10, Yang Shi wrote:
> > > One question, when doing demote and promote we need to define a path, for
> > > example, DRAM <-> PMEM (assume two tier memory). When determining what nodes
> > > are "DRAM" nodes, does it make sense to assume the nodes with both cpu and
> > > memory are DRAM nodes since PMEM nodes are typically cpuless nodes?
> > Do we really have to special case this for PMEM? Why cannot we simply go
> > in the zonelist order? In other words why cannot we use the same logic
> > for a larger NUMA machine and instead of swapping simply fall back to a
> > less contended NUMA node? It can be a regular DRAM, PMEM or whatever
> > other type of memory node.
>
> Thanks for the suggestion. It makes sense. However, if we don't special-case a
> pmem node, its fallback node may be a DRAM node, and memory reclaim may then
> move the inactive page to the DRAM node. That doesn't make much sense,
> since memory reclaim would prefer to move downwards (DRAM -> PMEM -> Disk).

There are certainly many details to sort out. One thing is how to handle
cpuless nodes (e.g. PMEM). Those shouldn't get any direct allocations
without an explicit binding, right? My first naive idea would be to
migrate-on-reclaim only from the preferred node. We might need
additional heuristics but I wouldn't special case PMEM from other
cpuless NUMA nodes.
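
To make the idea concrete, here is a rough userspace sketch of the
policy, not kernel code. struct toy_node, toy_demotion_target() and the
min_free threshold are all made up for illustration; a real
implementation would walk the actual zonelist and use the watermarks.
The only two rules modelled are "migrate-on-reclaim only from the
preferred node" and "take the first less contended node in fallback
order, whatever type of memory it is".

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_NODE (-1)

/* Toy model of a NUMA node; hypothetical, not a kernel structure. */
struct toy_node {
	unsigned long free_pages;	/* crude proxy for contention */
	const int *fallback;		/* node ids in zonelist-like order */
	size_t nr_fallback;
};

/*
 * Pick a node to migrate a cold page to instead of reclaiming it.
 * Returns TOY_NO_NODE when the page should take the ordinary reclaim path.
 *
 * Note that the node type (DRAM, PMEM, ...) is never consulted.
 */
int toy_demotion_target(const struct toy_node *nodes, size_t nr_nodes,
			int page_nid, int preferred_nid,
			unsigned long min_free)
{
	const struct toy_node *src;
	size_t i;

	assert(page_nid >= 0 && (size_t)page_nid < nr_nodes);

	/*
	 * Migrate only from the preferred node. A page that already sits
	 * on a fallback node (e.g. a cpuless one) is never moved back up.
	 */
	if (page_nid != preferred_nid)
		return TOY_NO_NODE;

	src = &nodes[page_nid];
	for (i = 0; i < src->nr_fallback; i++) {
		int nid = src->fallback[i];

		/* Skip the source node itself and malformed entries. */
		if (nid == page_nid || nid < 0 || (size_t)nid >= nr_nodes)
			continue;

		/* First node in fallback order with enough room wins. */
		if (nodes[nid].free_pages >= min_free)
			return nid;
	}

	return TOY_NO_NODE;
}
```

With two DRAM nodes (0, 1) and one cpuless node (2), a cold page on
node 0 goes to node 1 if that has room and to node 2 otherwise, while a
page already on node 2 is simply reclaimed, so nothing moves from PMEM
back to DRAM.
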
--
Michal Hocko
SUSE Labs