[PATCH 5/7] sched/core: uclamp: use TG clamps to restrict TASK clamps

From: Patrick Bellasi
Date: Mon Apr 09 2018 - 12:57:10 EST


When a task's util_clamp value is configured via sched_setattr, this
value has to be properly accounted in the corresponding clamp group
every time the task is enqueued and dequeued. When cgroups are also in
use, per-task clamp values have to be aggregated with those of the CPU
controller's CGroup in which the task currently lives.

Let's update uclamp_cpu_get() to provide an aggregation between the task
and the TG clamp values. Every time a task is enqueued, it will be
accounted in the clamp_group which defines the smaller clamp value
between the task's and the TG's ones. This mimics what already happens
for a task's CPU affinity mask when the task is also living in a cpuset.
The overall idea is that CGroup attributes are always used to restrict
the per-task attributes.

For consistency purposes, as well as to properly inform userspace, the
sched_getattr call is updated to always return the properly aggregated
constraints as described above. This also makes sched_getattr a
convenient userspace API to know the utilization constraints enforced on
a task by the CGroup's CPU controller.
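
As an illustration only, the aggregation rule described above can be
sketched as a standalone helper. The struct and function names below
(uclamp_se_sketch, uclamp_effective) and the constant values are
hypothetical stand-ins, not the kernel's definitions; only the
comparison mirrors what this patch does at enqueue time.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's constants (values assumed). */
#define UCLAMP_MIN	0
#define UCLAMP_MAX	1
#define UCLAMP_CNT	2
#define UCLAMP_NONE	(-1)

/* Simplified clamp descriptor: a clamp value and its clamp group. */
struct uclamp_se_sketch {
	int value;
	int group_id;
};

/*
 * Return the clamp to account for a task at enqueue time: the TG's
 * clamp is used when the task has no clamp of its own, or when the
 * task's value is not smaller than the TG's one.
 */
static struct uclamp_se_sketch
uclamp_effective(const struct uclamp_se_sketch *task,
		 const struct uclamp_se_sketch *tg, int clamp_id)
{
	assert(clamp_id >= 0 && clamp_id < UCLAMP_CNT);

	if (task[clamp_id].group_id == UCLAMP_NONE ||
	    task[clamp_id].value >= tg[clamp_id].value)
		return tg[clamp_id];

	return task[clamp_id];
}
```

With a task requesting util_min=200 and util_max=800 inside a TG
configured with util_min=100 and util_max=1024, this sketch selects the
TG's clamp for UCLAMP_MIN (100) and the task's clamp for UCLAMP_MAX
(800), i.e. the smaller value in both cases.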

Signed-off-by: Patrick Bellasi <patrick.bellasi@xxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Paul Turner <pjt@xxxxxxxxxx>
Cc: Joel Fernandes <joelaf@xxxxxxxxxx>
Cc: Steve Muckle <smuckle@xxxxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Morten Rasmussen <morten.rasmussen@xxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
Cc: linux-pm@xxxxxxxxxxxxxxx
---
kernel/sched/core.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b8299a4f03e7..592de8d32427 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -966,9 +966,18 @@ static inline void uclamp_cpu_get(struct task_struct *p, int cpu, int clamp_id)
 	clamp_value = p->uclamp[clamp_id].value;
 	group_id = p->uclamp[clamp_id].group_id;

+#ifdef CONFIG_UCLAMP_TASK_GROUP
+	/* Use TG's clamp value to limit task specific values */
+	if (group_id == UCLAMP_NONE ||
+	    clamp_value >= task_group(p)->uclamp[clamp_id].value) {
+		clamp_value = task_group(p)->uclamp[clamp_id].value;
+		group_id = task_group(p)->uclamp[clamp_id].group_id;
+	}
+#else
 	/* No task specific clamp values: nothing to do */
 	if (group_id == UCLAMP_NONE)
 		return;
+#endif

 	/* Increment the current group_id */
 	uc_cpu->group[group_id].tasks += 1;
@@ -5401,6 +5410,12 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
 #ifdef CONFIG_UCLAMP_TASK
 	attr.sched_util_min = p->uclamp[UCLAMP_MIN].value;
 	attr.sched_util_max = p->uclamp[UCLAMP_MAX].value;
+#ifdef CONFIG_UCLAMP_TASK_GROUP
+	if (task_group(p)->uclamp[UCLAMP_MIN].value < attr.sched_util_min)
+		attr.sched_util_min = task_group(p)->uclamp[UCLAMP_MIN].value;
+	if (task_group(p)->uclamp[UCLAMP_MAX].value < attr.sched_util_max)
+		attr.sched_util_max = task_group(p)->uclamp[UCLAMP_MAX].value;
+#endif
 #endif

 	rcu_read_unlock();
--
2.15.1