Re: [PATCH v6 0/6] Introduce multi-preference mempolicy

From: Dave Hansen
Date: Thu Jul 15 2021 - 14:55:49 EST


On 7/14/21 5:15 PM, Andrew Morton wrote:
> On Mon, 12 Jul 2021 16:09:28 +0800 Feng Tang <feng.tang@xxxxxxxxx> wrote:
>> This patch series introduces the concept of the MPOL_PREFERRED_MANY mempolicy.
>> This mempolicy mode can be used with either the set_mempolicy(2) or mbind(2)
>> interfaces. Like the MPOL_PREFERRED interface, it allows an application to set a
>> preference for nodes which will fulfil memory allocation requests. Unlike the
>> MPOL_PREFERRED mode, it takes a set of nodes. Like the MPOL_BIND interface, it
>> works over a set of nodes. Unlike MPOL_BIND, it will not cause a SIGSEGV or
>> invoke the OOM killer if those preferred nodes are not available.
> Do we have any real-world testing which demonstrates the benefits of
> all of this?

Yes, it's actually been quite useful in practice already.

If we take persistent memory media (PMEM) and hot-add/online it with the
DAX kmem driver, we get NUMA nodes with lots of capacity (~6TB is
typical) but weird performance; PMEM has good read speed, but low write
speed.

That low write speed is *so* low that it dominates the performance more
than the distance from the CPUs. Folks who want PMEM really don't care
about locality. The discussions with the testers usually go something
like this:

Tester: How do I make my test use PMEM on nodes 2 and 3?
Kernel Guys: use 'numactl --membind=2-3'
Tester: I tried that, but I'm getting allocation failures once I fill up
        PMEM. Shouldn't it fall back to DRAM?
Kernel Guys: Fine, use 'numactl --preferred=2-3'
Tester: That worked, but it started using DRAM after it exhausted node 2.
Kernel Guys: Dang it. I forgot --preferred ignores everything after
             the first node. Fine, we'll patch the kernel.

This has happened more than once. End users want to be able to target
a specific kind of physical media, but don't want to have to deal with
the sharp edges of strict binding.

This has happened both with slow media like PMEM and "faster" media like
High-Bandwidth Memory.
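
For illustration, here is roughly what the tester's program could do with
the new mode instead of going through numactl. Treat it as a sketch under
assumptions, not as code from the series: node_range_mask() and
prefer_nodes() are made-up helper names, the raw syscall is used so that
libnuma is not required, and the fallback value 5 for MPOL_PREFERRED_MANY
is my assumption about the uapi header and should be checked against the
header you actually build with.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Assumed value; only used when the installed headers lack the mode. */
#ifndef MPOL_PREFERRED_MANY
#define MPOL_PREFERRED_MANY 5
#endif

/* Hypothetical helper: one-word nodemask with bits first..last set. */
static unsigned long node_range_mask(unsigned int first, unsigned int last)
{
	unsigned long mask = 0;
	unsigned int n;

	assert(first <= last);
	for (n = first; n <= last && n < 8 * sizeof(mask); n++)
		mask |= 1UL << n;
	return mask;
}

/*
 * Hypothetical helper: ask the kernel to prefer every node in 'mask' for
 * the calling thread, while still allowing fallback to other nodes.
 * Returns 0 on success or a negative errno. A kernel that does not know
 * the mode is expected to fail with -EINVAL.
 */
static int prefer_nodes(unsigned long mask)
{
	/* Third argument: number of bits in the nodemask. */
	long ret = syscall(SYS_set_mempolicy, MPOL_PREFERRED_MANY,
			   &mask, 8 * sizeof(mask));

	return ret == 0 ? 0 : -errno;
}
```

The PMEM case from the conversation above would then be
prefer_nodes(node_range_mask(2, 3)) early in startup. With that in place,
allocations should be tried on nodes 2 and 3 first and only spill to DRAM
once both are exhausted, which is the behavior the testers kept asking for.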