Re: [External] Re: [PATCH] cpuset: introduce non-blocking cpuset.mems setting option
From: Zhongkun He
Date: Wed Jun 18 2025 - 23:50:47 EST
On Wed, Jun 18, 2025 at 5:05 PM Michal Koutný <mkoutny@xxxxxxxx> wrote:
>
> On Wed, Jun 18, 2025 at 10:46:02AM +0800, Zhongkun He <hezhongkun.hzk@xxxxxxxxxxxxx> wrote:
> > It is unnecessary to adjust memory affinity periodically from userspace,
> > as it is a costly operation.
>
> It'd always be costly when there's lots of data to migrate.
>
> > Instead, we need to shrink cpuset.mems to explicitly specify the NUMA
> > node from which newly allocated pages should come, and then migrate
> > the existing pages gradually from userspace or let NUMA balancing
> > adjust them.
>
> IIUC, the issue is that there's no set_mempolicy(2) for 3rd-party
> threads (it only operates on current) OR that the migration path
> should be optimized to avoid those latencies -- do you know what the
> contention point is?
Hi Michal,

In our scenario, when we shrink the allowed cpuset.mems, for example
from nodes 1,2,3 to just nodes 2,3, there may still be a large number
of pages residing on node 1. Currently, modifying cpuset.mems triggers
synchronous memory migration, which results in prolonged and
unacceptable service downtime under cgroup v2. This behavior has
become a major blocker for us in adopting cgroup v2.
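
To make the scenario concrete, here is a minimal sketch (the cgroup
path is a placeholder, not from the patch) of the single write that
triggers the synchronous migration today:

/*
 * Minimal sketch: shrink cpuset.mems for a cgroup v2 group. The
 * path "/sys/fs/cgroup/mygroup" is a placeholder. Under current
 * cgroup v2 behavior this one write also migrates every page off
 * node 1 synchronously, which is where the downtime comes from.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/sys/fs/cgroup/mygroup/cpuset.mems", O_WRONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	/* Shrink the allowed nodes from 1-3 to 2-3. */
	if (write(fd, "2-3", 3) < 0)
		perror("write");
	close(fd);
	return 0;
}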
Tejun suggested adding an interface to control the migration rate,
and I plan to try that later. However, we believe that an interface
like cgroup v1's cpuset.memory_migrate is also sufficient for our
use case and is easier to work with. :)
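
For reference, a minimal sketch of what the gradual userspace-driven
migration could look like with libnuma's migrate_pages(2) wrapper;
the pid and node numbers are placeholders, not part of the patch:

/*
 * Minimal sketch: drain a task's pages off node 1 onto node 2
 * from userspace, which can be done in our own time (or left to
 * NUMA balancing) instead of stalling inside the cpuset.mems
 * write. Build with -lnuma.
 */
#include <numa.h>
#include <stdio.h>

int migrate_off_node1(int pid)
{
	struct bitmask *from = numa_allocate_nodemask();
	struct bitmask *to = numa_allocate_nodemask();
	int ret;

	numa_bitmask_setbit(from, 1);	/* source: node 1 */
	numa_bitmask_setbit(to, 2);	/* destination: node 2 */

	/*
	 * Returns the number of pages that could not be moved,
	 * or -1 on error.
	 */
	ret = numa_migrate_pages(pid, from, to);
	if (ret < 0)
		perror("numa_migrate_pages");

	numa_free_nodemask(from);
	numa_free_nodemask(to);
	return ret;
}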
Thanks,
Zhongkun
>
> Thanks,
> Michal