Re: [PATCH v2 0/5] mm: demotion: Introduce new node state N_DEMOTION_TARGETS

From: Andrew Morton
Date: Wed Apr 13 2022 - 17:44:40 EST


On Wed, 13 Apr 2022 14:52:01 +0530 Jagdish Gediya <jvgediya@xxxxxxxxxxxxx> wrote:

> The current implementation finds demotion targets based on the
> node state N_MEMORY. However, some systems may have DRAM-only
> NUMA nodes that are N_MEMORY but are not the right choice as
> demotion targets.

Why are they not the right choice? Please describe this fully so we
can understand the motivation for the proposed change and its
end-user benefits.

> This patch series introduces a new node state,
> N_DEMOTION_TARGETS, which distinguishes the nodes that can be
> used as demotion targets. node_states[N_DEMOTION_TARGETS] holds
> the list of such nodes. Support is also added for setting the
> demotion target list from user space so that the default behavior
> can be overridden.
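
For readers following along, the idea described in the quoted paragraph
can be sketched in user space roughly as below. This is only an
illustrative model, not the code from the series: the `model_*` names,
the 64-bit mask, and the "nearest eligible node" rule are all
assumptions made for the example, and the real logic lives in
mm/migrate.c and uses the kernel's nodemask helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model; NOT the kernel's nodemask_t API. */
#define MODEL_MAX_NODES 64

enum model_node_state {
	MODEL_N_MEMORY,            /* node has memory */
	MODEL_N_DEMOTION_TARGETS,  /* node may receive demoted pages */
	MODEL_NR_STATES
};

/* One bitmask per state; bit N set means node N is in that state. */
static uint64_t model_states[MODEL_NR_STATES];

static void model_node_set_state(int nid, enum model_node_state st)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	model_states[st] |= UINT64_C(1) << nid;
}

static int model_node_state(int nid, enum model_node_state st)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	return (int)((model_states[st] >> nid) & 1);
}

/*
 * Pick the closest node (by the caller-supplied distance table, laid
 * out row-major as distance[src * nr_nodes + dst]) that is marked as a
 * demotion target. Returns -1 if no other node qualifies. The
 * "closest wins" rule is an assumption made for this sketch.
 */
static int model_pick_target(int src, const int *distance, int nr_nodes)
{
	int best = -1;

	for (int nid = 0; nid < nr_nodes; nid++) {
		if (nid == src)
			continue;
		if (!model_node_state(nid, MODEL_N_DEMOTION_TARGETS))
			continue;
		if (best < 0 ||
		    distance[src * nr_nodes + nid] <
		    distance[src * nr_nodes + best])
			best = nid;
	}
	return best;
}
```

In this model, a DRAM-only node that is set in MODEL_N_MEMORY but not
in MODEL_N_DEMOTION_TARGETS is never returned by model_pick_target(),
which is the distinction the new node state is meant to express.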

Permanently extending the kernel ABI is a fairly big deal. Please
fully explain the end-user value, usage scenarios, etc.

What would go wrong if we simply omitted this interface?

> The node state N_DEMOTION_TARGETS is also set from the dax kmem
> driver. Certain types of memory that register through dax kmem
> (e.g. HBM) may not be the right choice for demotion, so in the
> future they should be distinguished based on certain attributes,
> and the dax kmem driver should avoid setting them as
> N_DEMOTION_TARGETS. However, the current implementation does not
> distinguish any such memory either and considers all N_MEMORY
> nodes as demotion targets, so this patch series doesn't modify
> the current behavior.
>
> The current code that sets migration targets is modified in
> this patch series to avoid some of the limitations on demotion
> target sharing and to consider only N_DEMOTION_TARGETS nodes
> when finding demotion targets.
>
> Changelog
> ----------
>
> v2:
> In v1, only the 1st patch of this series was sent. It was
> implemented to avoid some of the limitations on demotion target
> sharing. However, for certain NUMA topologies, the demotion
> targets found by that patch were not optimal, so the 1st patch
> in this series is modified according to suggestions from Huang
> and Baolin. Examples comparing the demotion lists produced by
> the existing and the changed implementation can be found in the
> commit message of the 1st patch.
>
> Jagdish Gediya (5):
> mm: demotion: Set demotion list differently
> mm: demotion: Add new node state N_DEMOTION_TARGETS
> mm: demotion: Add support to set targets from userspace
> device-dax/kmem: Set node state as N_DEMOTION_TARGETS
> mm: demotion: Build demotion list based on N_DEMOTION_TARGETS
>
> .../ABI/testing/sysfs-kernel-mm-numa | 12 ++++

This description is rather brief. Some additional user-facing material
under Documentation/ would help. Describe the format for writing to the
file, what is seen when reading from it, provide a bit of help to the
user so they can understand how to use it, what effects they might see,
etc.

> drivers/base/node.c | 4 ++
> drivers/dax/kmem.c | 2 +
> include/linux/nodemask.h | 1 +
> mm/migrate.c | 67 +++++++++++++++----
> 5 files changed, 72 insertions(+), 14 deletions(-)