Re: [PATCH 1/4] rcu: Speed up calling of RCU tasks callbacks

From: Paul E. McKenney
Date: Wed May 23 2018 - 11:14:12 EST


On Tue, May 22, 2018 at 11:38:12PM -0700, Joel Fernandes wrote:
> From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>
>
> RCU tasks callbacks can take at least 1 second before the callbacks are
> executed. This happens even if the hold-out tasks enter their quiescent states
> quickly. I noticed this when I was testing trampoline callback execution.
>
> To test the trampoline freeing, I wrote a simple script:
> cd /sys/kernel/debug/tracing/
> echo '__schedule_bug:traceon' > set_ftrace_filter;
> echo '!__schedule_bug:traceon' > set_ftrace_filter;
>
> In the background I had simple bash while loop:
> while [ 1 ]; do x=1; done &
>
> Total time of completion of above commands in seconds:
>
> With this patch:
> real 0m0.179s
> user 0m0.000s
> sys 0m0.054s
>
> Without this patch:
> real 0m1.098s
> user 0m0.000s
> sys 0m0.053s
>
> That's a greater than 6X speed up in performance. In order to accomplish
> this, I am waiting for HZ/10 time before entering the hold-out checking
> loop. The loop still preserves its checking of held tasks every 1 second
> as before, in case this first test doesn't succeed.
>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>

Given an ack from Steven, I would be happy to take this, give or take
some nits below.

Thanx, Paul

> Cc: Peter Zilstra <peterz@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
> Cc: Paul McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> Cc: byungchul.park@xxxxxxx
> Cc: kernel-team@xxxxxxxxxxx
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> ---
> kernel/rcu/update.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index 5783bdf86e5a..a28698e44b08 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -743,6 +743,12 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> */
> synchronize_srcu(&tasks_rcu_exit_srcu);
>
> + /*
> + * Wait a little bit incase held tasks are released

in case

> + * during their next timer ticks.
> + */
> + schedule_timeout_interruptible(HZ/10);
> +
> /*
> * Each pass through the following loop scans the list
> * of holdout tasks, removing any that are no longer
> @@ -755,7 +761,6 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> int rtst;
> struct task_struct *t1;
>
> - schedule_timeout_interruptible(HZ);
> rtst = READ_ONCE(rcu_task_stall_timeout);
> needreport = rtst > 0 &&
> time_after(jiffies, lastreport + rtst);
> @@ -768,6 +773,11 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> check_holdout_task(t, needreport, &firstreport);
> cond_resched();
> }
> +
> + if (list_empty(&rcu_tasks_holdouts))
> + break;
> +
> + schedule_timeout_interruptible(HZ);

Is there a better way to do this? Can this be converted into a for-loop?
Alternatively, would it make sense to have a firsttime local variable
initialized to true, to keep the schedule_timeout_interruptible() at
the beginning of the loop, but skip it on the first pass through the loop?

Don't get me wrong, what you have looks functionally correct, but
duplicating the condition might cause problems later on, for example,
should a bug fix be needed in the condition.

> }
>
> /*
> --
> 2.17.0.441.gb46fe60e1d-goog
>