[PATCH 1/1] cgroup/cpuset: disable sched domain rebuild when not required

From: Patrick Bellasi
Date: Wed May 23 2018 - 11:33:06 EST


generate_sched_domains() already handles the "special case for 99% of
systems" which require a single, full sched domain at the root,
spanning all the CPUs. However, the current support relies on an
expensive sequence of operations which destroys and recreates the exact
same scheduling domain configuration.
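
For context, that special case lives inside generate_sched_domains();
a simplified sketch of its shape (error handling and the exact CPU
masks elided; details vary slightly across kernel versions):

	/* kernel/cgroup/cpuset.c, simplified */
	static int generate_sched_domains(cpumask_var_t **domains,
					  struct sched_domain_attr **attributes)
	{
		...
		/* Special case for the 99% of systems with one, full, sched domain */
		if (is_sched_load_balance(&top_cpuset)) {
			ndoms = 1;
			doms = alloc_sched_domains(ndoms);
			...
			goto done;
		}
		...
	}

Even on this path we still allocate fresh domain masks and call into
the scheduler to rebuild what is effectively the same configuration.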

If we notice that:

1) CPUs listed in "cpuset.isolcpus" are excluded from load balancing
by the isolcpus= kernel boot option and will never be load balanced,
regardless of the value of "cpuset.sched_load_balance" in any
cpuset.

2) the root cpuset has "cpuset.sched_load_balance" enabled by default
at boot, and that flag is the only parameter of the root cpuset which
userspace can change at run-time.

then we know that, by default, every system comes up with a complete
and properly configured set of scheduling domains covering all the CPUs.
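
For reference, the flag in point 2 is the one queried by the existing
is_sched_load_balance() helper in kernel/cgroup/cpuset.c which, modulo
minor differences across kernel versions, reads:

	static inline int is_sched_load_balance(const struct cpuset *cs)
	{
		/* Test the CS_SCHED_LOAD_BALANCE bit in this cpuset's flags */
		return test_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
	}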

Thus, on every system, unless the user explicitly disables load
balancing for the top_cpuset, the scheduling domains configured at boot
time by the scheduler/topology code, and updated as a consequence of
hotplug events, are already properly configured for cpuset too.

This configuration is the default one for 99% of systems, and it's
also the one used by most Android devices, which never disable load
balancing on the top_cpuset.

Thus, while load balancing is enabled for the top_cpuset, destroying
and rebuilding the scheduling domains on every cpuset.cpus
reconfiguration is useless work which will always produce the same
result.

Let's perform this "special case" check earlier, within:

rebuild_sched_domains_locked()

thus completely skipping the expensive:

generate_sched_domains()
partition_sched_domains()

calls in all the cases where we know that the already-defined
scheduling domains will not be affected by any value of cpuset.cpus.
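
For reference, these are the prototypes of the two calls skipped on
the fast path (sketched here; exact signatures may differ slightly
across kernel versions):

	/* kernel/cgroup/cpuset.c */
	static int generate_sched_domains(cpumask_var_t **domains,
					  struct sched_domain_attr **attributes);

	/* kernel/sched/topology.c */
	void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
				     struct sched_domain_attr *dattr_new);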

The proposed solution is the minimal change required to optimize the
case of systems with load balancing enabled at the root level and no
isolated CPUs. As soon as either of these conditions no longer holds,
we fall back to the original behavior.

Signed-off-by: Patrick Bellasi <patrick.bellasi@xxxxxxx>
Cc: Li Zefan <lizefan@xxxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Mike Galbraith <efault@xxxxxx>
Cc: Paul Turner <pjt@xxxxxxxxxx>
Cc: Waiman Long <longman@xxxxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
Cc: kernel-team@xxxxxx
Cc: cgroups@xxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
kernel/cgroup/cpuset.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 8f586e8bdc98..cff14be94678 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -874,6 +874,11 @@ static void rebuild_sched_domains_locked(void)
 	    !cpumask_subset(top_cpuset.effective_cpus, cpu_active_mask))
 		goto out;
 
+	/* Special case for the 99% of systems with one, full, sched domain */
+	if (!top_cpuset.isolation_count &&
+	    is_sched_load_balance(&top_cpuset))
+		goto out;
+
 	/* Generate domain masks and attrs */
 	ndoms = generate_sched_domains(&doms, &attr);
 
--
2.15.1
---8<---


--
#include <best/regards.h>

Patrick Bellasi