Re: [PATCH v4 1/5] sched/deadline: Refer to cpudl.elements atomically

From: Steven Rostedt
Date: Fri May 12 2017 - 10:25:43 EST


On Fri, 12 May 2017 14:48:45 +0900
Byungchul Park <byungchul.park@xxxxxxx> wrote:

> cpudl.elements is an instance that should be protected with a spin lock.
> Without it, the code would be insane.

And how much contention will this add? Spin locks in scheduler code
that are shared across a domain can cause huge latency. This is why I
worked hard not to add any in the cpupri code.


>
> Current cpudl_find() has problems like,
>
> 1. cpudl.elements[0].cpu might not match with cpudl.elements[0].dl.
> 2. cpudl.elements[0].dl(u64) might not be referred atomically.
> 3. Two cpudl_maximum()s might return different values.
> 4. It's just insane.

And lockless algorithms usually are insane. But locks come with a huge
cost. What happens when we have 32-core domains? This can cause
tremendous contention and make the entire cpu priority handling for
deadlines useless. We might as well rip out the code.

I haven't looked too hard into the deadline version; I may have to
spend some time doing so soon. But unfortunately, I have other critical
sections to spend brain cycles on.

-- Steve


>
> Signed-off-by: Byungchul Park <byungchul.park@xxxxxxx>
> ---
> kernel/sched/cpudeadline.c | 37 ++++++++++++++++++++++++++++++-------
> 1 file changed, 30 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> index fba235c..6b67016 100644
> --- a/kernel/sched/cpudeadline.c
> +++ b/kernel/sched/cpudeadline.c
> @@ -131,16 +131,39 @@ int cpudl_find(struct cpudl *cp, struct task_struct *p,
> cpumask_and(later_mask, cp->free_cpus, &p->cpus_allowed)) {
> best_cpu = cpumask_any(later_mask);
> goto out;
> - } else if (cpumask_test_cpu(cpudl_maximum(cp), &p->cpus_allowed) &&
> - dl_time_before(dl_se->deadline, cp->elements[0].dl)) {
> - best_cpu = cpudl_maximum(cp);
> - if (later_mask)
> - cpumask_set_cpu(best_cpu, later_mask);
> + } else {
> + u64 cpudl_dl;
> + int cpudl_cpu;
> + int cpudl_valid;
> + unsigned long flags;
> +
> + /*
> + * Referring to cp->elements must be atomic ops.
> + */
> + raw_spin_lock_irqsave(&cp->lock, flags);
> + /*
> + * No problem even in case of very initial heap tree
> + * to which no entry has been added yet, since
> + * cp->elements[0].cpu was initialized to zero and
> + * cp->elements[0].idx was initialized to IDX_INVALID,
> + * that means the case will be filtered out at the
> + * following condition.
> + */
> + cpudl_cpu = cpudl_maximum(cp);
> + cpudl_dl = cp->elements[0].dl;
> + cpudl_valid = cp->elements[cpudl_cpu].idx;
> + raw_spin_unlock_irqrestore(&cp->lock, flags);
> +
> + if (cpudl_valid != IDX_INVALID &&
> + cpumask_test_cpu(cpudl_cpu, &p->cpus_allowed) &&
> + dl_time_before(dl_se->deadline, cpudl_dl)) {
> + best_cpu = cpudl_cpu;
> + if (later_mask)
> + cpumask_set_cpu(best_cpu, later_mask);
> + }
> }
>
> out:
> - WARN_ON(best_cpu != -1 && !cpu_present(best_cpu));
> -
> return best_cpu;
> }
>