[RFC PATCH 17/18] cgroup/cpuset: Documentation updates & don't use CPU 0 for isolated partition

From: Waiman Long
Date: Fri Aug 08 2025 - 11:24:46 EST


CPU hotplug is now used to improve the isolation of CPUs in isolated
partitions. The boot CPU (typically CPU 0) cannot be taken offline,
which limits the degree of isolation achievable for it. Users therefore
have to be advised that the boot CPU should never be used in an isolated
partition. A warning is printed when the boot CPU is used, and
cgroup-v2.rst is updated accordingly. The test_cpuset_prs.sh selftest is
also updated to remove CPU 0 when forming isolated partitions.

Also update cgroup-v2.rst to document that the "nohz_full" kernel boot
parameter must be specified to get better nohz_full behavior for the
CPUs in isolated partitions, and to describe the latency spike issue
that comes with using CPU hotplug.

Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
---
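Note (not part of the commit): the advice above can be illustrated with a
minimal sketch of how a user might keep the boot CPU out of an isolated
partition. It assumes the boot CPU is CPU 0 unless told otherwise and that
cgroup v2 is mounted at /sys/fs/cgroup. The helper name drop_boot_cpu and
the cgroup name "isol" are hypothetical and do not appear in this series.

```shell
# Hypothetical helper: expand a cpulist such as "0-3,5" and drop the boot CPU.
# $1 = cpulist, $2 = boot CPU number (assumed to be 0 when omitted).
# The body runs in a subshell so that IFS and the variables stay local.
drop_boot_cpu() (
    list=$1
    boot=${2:-0}
    out=""
    IFS=,
    for part in $list; do
        lo=${part%-*}       # "1-3" -> 1, "5" -> 5
        hi=${part#*-}       # "1-3" -> 3, "5" -> 5
        cpu=$lo
        while [ "$cpu" -le "$hi" ]; do
            if [ "$cpu" -ne "$boot" ]; then
                out="${out:+$out,}$cpu"
            fi
            cpu=$((cpu + 1))
        done
    done
    echo "$out"
)

# drop_boot_cpu 0-3    prints 1,2,3
#
# Possible use (needs root; "isol" is a hypothetical cgroup name):
#   echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control
#   mkdir /sys/fs/cgroup/isol
#   drop_boot_cpu 0-3 > /sys/fs/cgroup/isol/cpuset.cpus
#   echo isolated > /sys/fs/cgroup/isol/cpuset.cpus.partition
```

The helper only filters the list. Whether the partition actually becomes
valid still has to be checked by reading cpuset.cpus.partition back.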
Documentation/admin-guide/cgroup-v2.rst | 33 +++++++++++++++----
.../selftests/cgroup/test_cpuset_prs.sh | 8 ++---
2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index d9d3cc7df348..26213383b34b 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2556,11 +2556,12 @@ Cpuset Interface Files

It accepts only the following input values when written to.

- ========== =====================================
+ ========== ===============================================
"member" Non-root member of a partition
"root" Partition root
- "isolated" Partition root without load balancing
- ========== =====================================
+ "isolated" Partition root without load balancing and other
+ OS noise
+ ========== ===============================================

A cpuset partition is a collection of cpuset-enabled cgroups with
a partition root at the top of the hierarchy and its descendants
@@ -2593,9 +2594,29 @@ Cpuset Interface Files

When set to "isolated", the CPUs in that partition will be in
an isolated state without any load balancing from the scheduler
- and excluded from the unbound workqueues. Tasks placed in such
- a partition with multiple CPUs should be carefully distributed
- and bound to each of the individual CPUs for optimal performance.
+ and excluded from the unbound workqueues as well as from other
+ sources of OS noise. Tasks placed in such a partition with
+ multiple CPUs should be carefully distributed and bound to each
+ of the individual CPUs for optimal performance.
+
+ CPU hotplug, if supported, is used to bring the degree of CPU
+ isolation close to that of the "nohz_full" kernel boot parameter.
+ The boot CPU (typically CPU 0) cannot be brought offline, so the
+ boot CPU should not be used for forming isolated partitions.
+ The "nohz_full" kernel boot parameter needs to be present to
+ enable full dynticks support and RCU no-callback CPU mode for
+ CPUs in isolated partitions, even if the optional CPU list
+ isn't provided. Without that, the "rcu_nocbs" kernel boot
+ parameter without a CPU list can be used to enable
+ RCU no-callback CPU mode without full dynticks.
+
+ Using CPU hotplug for creating or destroying an isolated
+ partition can cause latency spikes in applications running
+ in other isolated partitions. A reserved list of CPUs can
+ optionally be put in the "nohz_full" kernel boot parameter to
+ alleviate this problem. When these reserved CPUs are used for
+ isolated partitions, CPU hotplug won't need to be invoked, so
+ there won't be latency spikes in other isolated partitions.

A partition root ("root" or "isolated") can be in one of the
two possible states - valid or invalid. An invalid partition
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index a17256d9f88a..f61369be8bf6 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -318,8 +318,8 @@ TEST_MATRIX=(
# Invalid to valid local partition direct transition tests
" C1-3:S+:P2 X4:P2 . . . . . . 0 A1:1-3|XA1:1-3|A2:1-3:XA2: A1:P2|A2:P-2 1-3"
" C1-3:S+:P2 X4:P2 . . . X3:P2 . . 0 A1:1-2|XA1:1-3|A2:3:XA2:3 A1:P2|A2:P2 1-3"
- " C0-3:P2 . . C4-6 C0-4 . . . 0 A1:0-4|B1:4-6 A1:P-2|B1:P0"
- " C0-3:P2 . . C4-6 C0-4:C0-3 . . . 0 A1:0-3|B1:4-6 A1:P2|B1:P0 0-3"
+ " C1-3:P2 . . C4-6 C1-4 . . . 0 A1:1-4|B1:4-6 A1:P-2|B1:P0"
+ " C1-3:P2 . . C4-6 C1-4:C1-3 . . . 0 A1:1-3|B1:4-6 A1:P2|B1:P0 1-3"

# Local partition invalidation tests
" C0-3:X1-3:S+:P2 C1-3:X2-3:S+:P2 C2-3:X3:P2 \
@@ -329,8 +329,8 @@ TEST_MATRIX=(
" C0-3:X1-3:S+:P2 C1-3:X2-3:S+:P2 C2-3:X3:P2 \
. . C4:X . . 0 A1:1-3|A2:1-3|A3:2-3|XA2:|XA3: A1:P2|A2:P-2|A3:P-2 1-3"
# Local partition CPU change tests
- " C0-5:S+:P2 C4-5:S+:P1 . . . C3-5 . . 0 A1:0-2|A2:3-5 A1:P2|A2:P1 0-2"
- " C0-5:S+:P2 C4-5:S+:P1 . . C1-5 . . . 0 A1:1-3|A2:4-5 A1:P2|A2:P1 1-3"
+ " C1-5:S+:P2 C4-5:S+:P1 . . . C3-5 . . 0 A1:1-2|A2:3-5 A1:P2|A2:P1 1-2"
+ " C1-5:S+:P2 C4-5:S+:P1 . . C2-5 . . . 0 A1:2-3|A2:4-5 A1:P2|A2:P1 2-3"

# cpus_allowed/exclusive_cpus update tests
" C0-3:X2-3:S+ C1-3:X2-3:S+ C2-3:X2-3 \
--
2.50.0