Re: [RFC PATCH 9/9] sched/deadline: Allow deeper hierarchies of RT cgroups
From: Juri Lelli
Date: Tue Jun 10 2025 - 05:25:03 EST
Hello,
On 05/06/25 09:14, Yuri Andriaccio wrote:
> From: luca abeni <luca.abeni@xxxxxxxxxxxxxxx>
>
> Allow creation of cgroup hierarchies with depth greater than two.
> Add check to prevent attaching tasks to a child cgroup of an active cgroup (i.e.
> with a running FIFO/RR task).
> Add check to prevent attaching tasks to cgroups which have children with
> non-zero runtime.
> Update rt-cgroups allocated bandwidth accounting for nested cgroup hierarchies.
>
> Co-developed-by: Yuri Andriaccio <yurand2000@xxxxxxxxx>
> Signed-off-by: Yuri Andriaccio <yurand2000@xxxxxxxxx>
> Signed-off-by: luca abeni <luca.abeni@xxxxxxxxxxxxxxx>
> ---
>  kernel/sched/core.c     |  6 ----
>  kernel/sched/deadline.c | 69 ++++++++++++++++++++++++++++++++++-------
>  kernel/sched/rt.c       | 25 +++++++++++++--
>  kernel/sched/sched.h    |  2 +-
>  kernel/sched/syscalls.c |  4 +++
>  5 files changed, 84 insertions(+), 22 deletions(-)
...
> @@ -434,24 +463,40 @@ int dl_init_tg(struct sched_dl_entity *dl_se, u64 rt_runtime, u64 rt_period)
>  	if (rt_period & (1ULL << 63))
>  		return 0;
>
> +	is_active_group = is_active_sched_group(tg);
> +
>  	raw_spin_rq_lock_irq(rq);
>  	is_active = dl_se->my_q->rt.rt_nr_running > 0;
>  	old_runtime = dl_se->dl_runtime;
>  	dl_se->dl_runtime = rt_runtime;
>  	dl_se->dl_period = rt_period;
>  	dl_se->dl_deadline = dl_se->dl_period;
> -	if (is_active) {
> -		sub_running_bw(dl_se, dl_se->dl_rq);
> -	} else if (dl_se->dl_non_contending) {
> -		sub_running_bw(dl_se, dl_se->dl_rq);
> -		dl_se->dl_non_contending = 0;
> -		hrtimer_try_to_cancel(&dl_se->inactive_timer);
> +	if (is_active_group) {
> +		if (is_active) {
> +			sub_running_bw(dl_se, dl_se->dl_rq);
> +		} else if (dl_se->dl_non_contending) {
> +			sub_running_bw(dl_se, dl_se->dl_rq);
> +			dl_se->dl_non_contending = 0;
> +			hrtimer_try_to_cancel(&dl_se->inactive_timer);
> +		}
> +		__sub_rq_bw(dl_se->dl_bw, dl_se->dl_rq);
> +		dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
> +		__add_rq_bw(dl_se->dl_bw, dl_se->dl_rq);
> +	} else {
> +		dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
> +	}
> +
> +	// add/remove the parent's bw
> +	if (tg->parent && tg->parent != &root_task_group)
> +	{
> +		if (rt_runtime == 0 && old_runtime != 0 && !sched_group_has_active_siblings(tg)) {
> +			__add_rq_bw(tg->parent->dl_se[cpu]->dl_bw, dl_se->dl_rq);
> +		} else if (rt_runtime != 0 && old_runtime == 0 && !sched_group_has_active_siblings(tg)) {
> +			__sub_rq_bw(tg->parent->dl_se[cpu]->dl_bw, dl_se->dl_rq);
> +		}

Don't we also need to handle the case where rt_runtime changes
({in,de}creases) and old_runtime wasn't zero? For example, giving some
bandwidth back to the parent when a child's bandwidth is reduced, but
not completely set to zero.

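To make the question concrete, here is a minimal userspace sketch (not
kernel code) of delta-based accounting, where any change in a child's
reservation is reflected in what the parent has left, not only the
zero/non-zero transitions. Everything named model_* is hypothetical and
only for illustration; model_to_ratio() mimics the fixed-point math of the
kernel's to_ratio() with a shift of 20. It also assumes that a parent's own
share is its reservation minus whatever its children hold, which may well
not be the accounting model the series intends.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_BW_SHIFT 20	/* fixed-point shift, as used by to_ratio() */

/* Userspace stand-in for to_ratio(): runtime/period in fixed point. */
static uint64_t model_to_ratio(uint64_t period, uint64_t runtime)
{
	if (period == 0)
		return 0;
	return (runtime << MODEL_BW_SHIFT) / period;
}

/* Hypothetical per-CPU view of one group in the hierarchy. */
struct model_group {
	struct model_group *parent;
	uint64_t bw;		/* bandwidth reserved for this group */
	uint64_t children_bw;	/* sum of bandwidth handed to its children */
};

/* Bandwidth the group still holds for itself (assumed model). */
static uint64_t model_own_share(const struct model_group *g)
{
	assert(g->children_bw <= g->bw);
	return g->bw - g->children_bw;
}

/*
 * Change a group's reservation. Returns 0 on success, -1 if the new value
 * would not cover the group's own children or would not fit in the parent.
 * Shrinking a child (without zeroing it) gives the difference back to the
 * parent; growing it takes the difference away.
 */
static int model_set_runtime(struct model_group *g, uint64_t runtime,
			     uint64_t period)
{
	uint64_t new_bw = model_to_ratio(period, runtime);
	struct model_group *p = g->parent;

	if (new_bw < g->children_bw)
		return -1;

	if (p) {
		/* Bandwidth held by the siblings of g. */
		uint64_t others = p->children_bw - g->bw;

		if (others + new_bw > p->bw)
			return -1;
		p->children_bw = others + new_bw;
	}

	g->bw = new_bw;
	return 0;
}
```

With a parent at 50/100 and a child going from 40/100 to 20/100,
model_own_share(&parent) grows by the difference in this model, which is
the case the hunk above does not seem to cover.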
Thanks,
Juri