Re: [PATCH 2/2] sched/deadline: Temporary copy static parameters to boosted non-DEADLINE entities

From: Juri Lelli
Date: Wed Nov 13 2019 - 04:22:52 EST


Hi,

On 12/11/19 11:51, Peter Zijlstra wrote:
> On Tue, Nov 12, 2019 at 08:50:56AM +0100, Juri Lelli wrote:
> > Boosted entities (Priority Inheritance) use static DEADLINE parameters
> > of the top priority waiter. However, there might be cases where top
> > waiter could be a non-DEADLINE entity that is currently boosted by a
> > DEADLINE entity from a different lock chain (i.e., nested priority
> > chains involving entities of non-DEADLINE classes). In this case, top
> > waiter static DEADLINE parameters could be null (initialized to 0 at
> > fork()) and replenish_dl_entity() would hit a BUG().
>
> Argh!
>
> > Fix this by temporarily copying static DEADLINE parameters of top
> > DEADLINE waiter (there must be at least one in the chain(s) for the
> > problem above to happen) into boosted entities. Parameters are reset
> > during deboost.
>
> Also, yuck!

Indeed. :-(

> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4441,19 +4441,21 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
> > if (!dl_prio(p->normal_prio) ||
> > (pi_task && dl_entity_preempt(&pi_task->dl, &p->dl))) {
> > p->dl.dl_boosted = 1;
> > + if (!dl_prio(p->normal_prio))
> > + __dl_copy_static(p, pi_task);
> > queue_flag |= ENQUEUE_REPLENISH;
> > } else
> > p->dl.dl_boosted = 0;
> > p->sched_class = &dl_sched_class;
>
> So I thought our basic approach was deadline inheritance and screw
> runtime accounting.
>
> Given that, I don't quite understand the REPLENISH hack there. Should we
> not simply copy dl->deadline around (and restore on unboost)?
>
> That is, should we not do something 'simple' like this:
>
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 84b26d38c929..1579c571cb83 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -522,6 +522,7 @@ struct sched_dl_entity {
> */
> s64 runtime; /* Remaining runtime for this instance */
> u64 deadline; /* Absolute deadline for this instance */
> + u64 normal_deadline;
> unsigned int flags; /* Specifying the scheduler behaviour */
>
> /*
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 26e4ffa01e7a..16164b0ba80b 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4452,9 +4452,11 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
> if (!dl_prio(p->normal_prio) ||
> (pi_task && dl_entity_preempt(&pi_task->dl, &p->dl))) {
> p->dl.dl_boosted = 1;
> - queue_flag |= ENQUEUE_REPLENISH;
> - } else
> + p->dl.deadline = pi_task->dl.deadline;
> + } else {
> p->dl.dl_boosted = 0;
> + p->dl.deadline = p->dl.normal_deadline;
> + }
> p->sched_class = &dl_sched_class;
> } else if (rt_prio(prio)) {
> if (dl_prio(oldprio))
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 43323f875cb9..0ad7c2797f11 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -674,6 +674,7 @@ static inline void setup_new_dl_entity(struct sched_dl_entity *dl_se)
> * spent on hardirq context, etc.).
> */
> dl_se->deadline = rq_clock(rq) + dl_se->dl_deadline;
> + dl_se->normal_deadline = dl_se->deadline;
> dl_se->runtime = dl_se->dl_runtime;
> }
>
> @@ -709,6 +710,7 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se,
> */
> if (dl_se->dl_deadline == 0) {
> dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline;
> + dl_se->normal_deadline = dl_se->deadline;
> dl_se->runtime = pi_se->dl_runtime;
> }
>
> @@ -723,6 +725,7 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se,
> */
> while (dl_se->runtime <= 0) {
> dl_se->deadline += pi_se->dl_period;
> + dl_se->normal_deadline = dl_se->deadline;
> dl_se->runtime += pi_se->dl_runtime;

So, the problem is more related to pi_se->dl_runtime than to its deadline.
Even if we don't replenish at the instant boosting happens, the boosted
task might still deplete its runtime while boosted. We don't enforce
runtime on boosted tasks, but we still do accounting and 'instant'
replenishment with deadline postponement ('soft CBS'), so update_curr_dl()
would eventually call enqueue_task_dl(..., ENQUEUE_REPLENISH). That in
turn hits BUG_ON(pi_se->dl_runtime <= 0): in a case like Glenn's, N2 and
N1 are non-DEADLINE tasks, and N1 would replenish from N2's (pi_se)
dl_runtime, finding it to be 0.
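
To make the failure mode concrete, here is a rough userspace sketch of the
accounting path (not kernel code). struct dl_entity, replenish() and
account_boosted() are hypothetical, trimmed-down stand-ins for
struct sched_dl_entity, replenish_dl_entity() and the depletion handling in
update_curr_dl(); replenish() returns false at the point where the kernel
would BUG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct sched_dl_entity. */
struct dl_entity {
	uint64_t dl_runtime;	/* static: max runtime per instance */
	uint64_t dl_deadline;	/* static: relative deadline */
	uint64_t dl_period;	/* static: period */
	int64_t  runtime;	/* dynamic: remaining runtime */
	uint64_t deadline;	/* dynamic: absolute deadline */
};

/*
 * 'Soft CBS' replenishment: postpone the deadline and refill the runtime
 * using the static parameters of pi_se (the top waiter).
 * Returns false where the kernel would BUG_ON(pi_se->dl_runtime <= 0).
 */
static bool replenish(struct dl_entity *dl_se, const struct dl_entity *pi_se)
{
	assert(dl_se != NULL && pi_se != NULL);

	if (pi_se->dl_runtime == 0)
		return false;

	while (dl_se->runtime <= 0) {
		dl_se->deadline += pi_se->dl_period;
		dl_se->runtime += (int64_t)pi_se->dl_runtime;
	}
	return true;
}

/*
 * Account delta_exec against a boosted entity. There is no throttling
 * while boosted, so depletion leads straight to an instant replenishment.
 */
static bool account_boosted(struct dl_entity *dl_se,
			    const struct dl_entity *pi_se,
			    int64_t delta_exec)
{
	dl_se->runtime -= delta_exec;
	if (dl_se->runtime > 0)
		return true;

	return replenish(dl_se, pi_se);
}
```

With a real DEADLINE top waiter (dl_runtime > 0), depletion just postpones
the deadline and refills the runtime. With a zero-initialized pi_se, as for
N2 above, account_boosted() fails on the first depletion, no matter what
deadline was copied at boost time.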

Does it make any sense?