Re: [PATCH RFC tip/core/rcu] Avoid resched_cpu() when rescheduling the current CPU

From: Paul E. McKenney
Date: Mon Jul 30 2018 - 10:59:36 EST

Next message: Tomas Janousek: "Re: [PATCH] block: fix NPE when resuming SCSI devices using blk-mq"
Previous message: Souptick Joarder: "[PATCH v2] fs/buffer: Convert return type int to vm_fault_t"
In reply to: Peter Zijlstra: "Re: [PATCH RFC tip/core/rcu] Avoid resched_cpu() when rescheduling the current CPU"
Next in thread: Peter Zijlstra: "Re: [PATCH RFC tip/core/rcu] Avoid resched_cpu() when rescheduling the current CPU"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, Jul 30, 2018 at 11:25:13AM +0200, Peter Zijlstra wrote:
> On Fri, Jul 27, 2018 at 08:49:31AM -0700, Paul E. McKenney wrote:
> > Hello, Peter,
> >
> > It occurred to me that it is wasteful to let resched_cpu() acquire
> > ->pi_lock when doing something like resched_cpu(smp_processor_id()),
>
> rq->lock

Good catch, will fix. And thank you for looking this over!

> > and that it would be better to instead use set_tsk_need_resched(current)
> > and set_preempt_need_resched().
> >
> > But is doing so really worthwhile? For that matter, are there some
> > constraints on the use of those two functions that I am failing to
> > allow for in the patch below?
>
>
> > The resched_cpu() interface is quite handy, but it does acquire the
> > specified CPU's runqueue lock, which does not come for free. This
> > commit therefore substitutes the following when directing resched_cpu()
> > at the current CPU:
> >
> > set_tsk_need_resched(current);
> > set_preempt_need_resched();
>
> That is only a valid substitute for resched_cpu(smp_processor_id()).

Understood.

> But also note how this can cause more context switches over
> resched_curr() for not checking if TIF_NEED_RESCHED wasn't already set.
>
> Something that might be more in line with
> resched_curr(smp_processor_id()) would be:
>
> preempt_disable();
> if (!test_tsk_need_resched(current)) {
> set_tsk_need_resched(current);
> set_preempt_need_resched();
> }
> preempt_enable();
>
> Where the preempt_enable() could of course instantly trigger the
> reschedule if it was the outer most one.

Ah. So should I use resched_curr() from rcu_check_callbacks(), which
is invoked from the scheduling-clock interrupt? Right now I have calls
to set_tsk_need_resched() and set_preempt_need_resched().

> > @@ -2674,10 +2675,12 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused
>
> > - resched_cpu(rdp->cpu); /* Provoke future context switch. */
>
> > + set_tsk_need_resched(current);
> > + set_preempt_need_resched();
>
> That's not obviously correct. rdp->cpu had better be smp_processor_id().

At the beginning of the function, we have:

struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);

And this is in a softirq handler, so we are OK.

> > @@ -672,7 +672,8 @@ static void sync_rcu_exp_handler(void *unused)
> > rcu_report_exp_rdp(rdp);
> > } else {
> > rdp->deferred_qs = true;
> > - resched_cpu(rdp->cpu);
> > + set_tsk_need_resched(t);
> > + set_preempt_need_resched();
>
> That only works if @t == current.

At the beginning of the function, we have:

struct task_struct *t = current;

So we should be OK.

> > }
> > return;
> > }
>
> > - else
> > - resched_cpu(rdp->cpu);
> > + } else {
> > + set_tsk_need_resched(t);
> > + set_preempt_need_resched();
>
> Similar...

Same function, so we should be good here as well.

> > }
>
> > @@ -791,8 +791,10 @@ static void rcu_flavor_check_callbacks(int user)
> > if (t->rcu_read_lock_nesting > 0 ||
> > (preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) {
> > /* No QS, force context switch if deferred. */
> > - if (rcu_preempt_need_deferred_qs(t))
> > - resched_cpu(smp_processor_id());
> > + if (rcu_preempt_need_deferred_qs(t)) {
> > + set_tsk_need_resched(t);
> > + set_preempt_need_resched();
> > + }
>
> And another dodgy one..

And the beginning of this function also has:

struct task_struct *t = current;

So good there as well.

Should I be instead using resched_curr() on some or all of these?

kernel/rcu/tiny.c rcu_check_callbacks():

Interrupts disabled (scheduling clock interrupt), so no
point in preempt_disable(). It would make sense to check
test_tsk_need_resched(). This is handling the case where someone
disabled something over rcu_read_unlock(), but got preempted
within (or had an overly long) RCU read-side critical section.
This used to result in deadlock, but now just messes up real-time
response.

kernel/rcu/tree.c print_cpu_stall():

Interrupts disabled, so no point in preempt_disable().
It might make sense to check test_tsk_need_resched(), but
on the other hand at this point this CPU has gone for
tens of seconds without a quiescent state. Wouldn't hurt
to check, though.

kernel/rcu/tree.c rcu_check_callbacks():

Interrupts disabled (scheduling clock interrupt), so no
point in preempt_disable(). It would make sense to check
test_tsk_need_resched(). This is handling the case where someone
disabled something over rcu_read_unlock(), but got preempted
within (or had an overly long) RCU read-side critical section.
This used to result in deadlock, but now just messes up real-time
response.

kernel/rcu/tree.c rcu_process_callbacks():

Softirqs disabled (softirq handler), so no point
in preempt_disable(). It might make sense to check
test_tsk_need_resched(). This is handling the case where someone
disabled something over rcu_read_unlock(), but got preempted
within (or had an overly long) RCU read-side critical section.
This used to result in deadlock, but now just messes up real-time
response.

kernel/rcu/tree_exp.h sync_rcu_exp_handler():
kernel/rcu/tree_exp.h sync_sched_exp_handler():

Interrupts disabled (IPI handler), so no point in
preempt_disable(). It might make sense to check
test_tsk_need_resched(). This is the expedited
grace-period case. (The first is PREEMPT, the second
!PREEMPT.)

kernel/rcu/tree_plugin.h rcu_flavor_check_callbacks():

Interrupts disabled (scheduling clock interrupt), so no
point in preempt_disable(). It would make sense to check
test_tsk_need_resched(). This is handling the case where someone
disabled something over rcu_read_unlock(), but got preempted
within (or had an overly long) RCU read-side critical section.
This used to result in deadlock, but now just messes up real-time
response.

So it looks safe for me to invoke resched_curr() in all cases. I don't
believe that the extra nested preempt_disable() will be a performance
problem. Anything that I am missing here?

Thanx, Paul

Next message: Tomas Janousek: "Re: [PATCH] block: fix NPE when resuming SCSI devices using blk-mq"
Previous message: Souptick Joarder: "[PATCH v2] fs/buffer: Convert return type int to vm_fault_t"
In reply to: Peter Zijlstra: "Re: [PATCH RFC tip/core/rcu] Avoid resched_cpu() when rescheduling the current CPU"
Next in thread: Peter Zijlstra: "Re: [PATCH RFC tip/core/rcu] Avoid resched_cpu() when rescheduling the current CPU"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]