Re: [patch 2/2] nohz: try to avoid IPI when setting tick dependency for task

From: Frederic Weisbecker
Date: Thu Sep 03 2020 - 11:02:29 EST


On Tue, Aug 25, 2020 at 03:41:49PM -0300, Marcelo Tosatti wrote:
> When enabling per-CPU posix timers, an IPI might be sent to all
> nohz_full CPUs (so that they re-read the tick dependencies and,
> if necessary, do not re-enter nohz_full mode on a given CPU).
>
> A common case is for applications that run on nohz_full= CPUs
> to not use POSIX timers (eg DPDK).
>
> This patch optimizes tick_nohz_dep_set_task() to avoid kicking
> all nohz_full= CPUs when the task's allowed CPU mask does not
> intersect the nohz_full= CPU mask.
>
> This reduces interruptions to nohz_full= CPUs.
>
> ---
> kernel/time/tick-sched.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/kernel/time/tick-sched.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/tick-sched.c
> +++ linux-2.6/kernel/time/tick-sched.c
> @@ -383,11 +383,16 @@ void tick_nohz_dep_set_task(struct task_
> tick_nohz_full_kick();
> preempt_enable();
> } else {
> + unsigned long flags;
> +
> /*
> * Some future tick_nohz_full_kick_task()
> - * should optimize this.
> + * should further optimize this.
> */
> - tick_nohz_full_kick_all();
> + raw_spin_lock_irqsave(&tsk->pi_lock, flags);
> + if (cpumask_intersects(&tsk->cpus_mask, tick_nohz_full_mask))
> + tick_nohz_full_kick_all();
> + raw_spin_unlock_irqrestore(&tsk->pi_lock, flags);
> }
> }
> }
>
>

Not long ago, Peterz suggested that we simply do:

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f0199a4ba1ad..42ce8e458013 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -357,17 +357,26 @@ EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_cpu);
void tick_nohz_dep_set_task(struct task_struct *tsk, enum tick_dep_bits bit)
{
if (!atomic_fetch_or(BIT(bit), &tsk->tick_dep_mask)) {
- if (tsk == current) {
- preempt_disable();
- tick_nohz_full_kick();
- preempt_enable();
- } else {
- /*
- * Some future tick_nohz_full_kick_task()
- * should optimize this.
- */
- tick_nohz_full_kick_all();
- }
+ int cpu = task_cpu(tsk);
+
+ /*
+ * If the task concurrently migrates to another cpu,
+ * we guarantee it sees the new tick dependency upon
+ * schedule.
+ *
+ * set_task_cpu(p, cpu);
+ * STORE p->cpu = @cpu
+ * __schedule() (switch to task 'p')
+ * LOCK rq->lock
+ * smp_mb__after_spin_lock() STORE p->tick_dep_mask
+ * tick_nohz_task_switch() smp_mb() (atomic_fetch_or())
+ * LOAD p->tick_dep_mask LOAD p->cpu
+ */
+
+ preempt_disable();
+ if (cpu_online(cpu))
+ tick_nohz_full_kick_cpu(cpu);
+ preempt_enable();
}
}
EXPORT_SYMBOL_GPL(tick_nohz_dep_set_task);