Re: [External] Re: [PATCH] cpuset: introduce non-blocking cpuset.mems setting option
From: Michal Koutný
Date: Thu Jun 19 2025 - 08:10:49 EST
On Thu, Jun 19, 2025 at 11:49:58AM +0800, Zhongkun He <hezhongkun.hzk@xxxxxxxxxxxxx> wrote:
> In our scenario, when we shrink the allowed cpuset.mems (for example,
> from nodes 1, 2, 3 to just nodes 2, 3), there may still be a large number of pages
> residing on node 1. Currently, modifying cpuset.mems triggers synchronous memory
> migration, which results in prolonged and unacceptable service downtime under
> cgroup v2. This behavior has become a major blocker for us in adopting
> cgroup v2.
>
> Tejun suggested adding an interface to control the migration rate, and
> I plan to try that later.
A rate limit sounds unnecessarily non-work-conserving to me; in principle,
adding cond_resched()s (or eventually having a preemptible kernel) should
achieve the same. Or how would that project onto service metrics?
(But I'm not familiar with this migration path, thus I was asking about
the contention points.)
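
To illustrate the cond_resched() point, here is a rough userspace sketch,
not kernel code and not the actual cpuset migration path. migrate_one_page(),
migrate_range() and the batch parameter are made up for illustration, and
sched_yield() only stands in for what cond_resched() would do in the kernel.
The loop runs at full speed when nothing else wants the CPU and merely offers
to yield every `batch` pages, instead of throttling itself to a fixed rate:

```c
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for migrating a single page; returns 1 on success. */
static int migrate_one_page(size_t idx)
{
	(void)idx;
	return 1;
}

/*
 * Walk nr_pages without any artificial rate limit, but offer the CPU to
 * other runnable tasks after every `batch` pages. A batch of 0 disables
 * the yield points entirely.
 *
 * Returns the number of pages migrated; *yields is set to the number of
 * yield points that were hit.
 */
static size_t migrate_range(size_t nr_pages, size_t batch, size_t *yields)
{
	size_t done = 0;

	*yields = 0;
	for (size_t i = 0; i < nr_pages; i++) {
		done += (size_t)migrate_one_page(i);

		/* Userspace analogue of a cond_resched() call site. */
		if (batch && (i + 1) % batch == 0) {
			sched_yield();
			(*yields)++;
		}
	}
	return done;
}
```

With an otherwise idle CPU the yield is close to free, so the total work done
per unit of time is not capped the way it would be with a rate knob.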
> However, we believe that the cpuset.migrate interface in cgroup v1 is
> also sufficient for our use case and is easier to work with. :)
Too easy, I think: it'd make cpuset.mems only an "advisory" constraint. (I
know it could be justified too, but perhaps not as a solution to costly
migrations.)
Michal