Re: [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded

From: Joel Fernandes
Date: Wed Feb 05 2020 - 16:54:38 EST


On Wed, Feb 05, 2020 at 11:08:24AM -0500, Joel Fernandes wrote:
> On Wed, Feb 05, 2020 at 10:49:45AM -0500, Steven Rostedt wrote:
> > On Wed, 5 Feb 2020 10:42:12 -0500
> > Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> >
> > > On Wed, Feb 05, 2020 at 09:28:47AM -0500, Steven Rostedt wrote:
> > > > On Wed, 5 Feb 2020 09:19:15 -0500
> > > > Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > > Could you paste the stack here when RCU is not watching? In trace event code
> > > > > IIRC we call rcu_enter_irqs_on() to have RCU temporarily watch, since that
> > > > > code can be called from the idle loop. Should we be doing the same here as well?
> > > >
> > > > Unfortunately I lost the stack trace. And the last time we tried to use
> > > > rcu_enter_irqs_on() for ftrace, we couldn't find a way to do this
> > > > properly. Ftrace is much more invasive than going into idle. The
> > > > problem is that ftrace traces RCU itself, and calling
> > > > "rcu_enter_irqs_on()" in pretty much any place in the RCU code caused
> > > > lots of bugs ;-)
> > > >
> > > > This is why we have the schedule_on_each_cpu(ftrace_sync) hack.
> > >
> > > The "schedule a task on each CPU" trick works on !PREEMPT though right?
> >
> > It works on both, as I care more about the PREEMPT=y case than
> > the !PREEMPT, and the PREEMPT_RT which is even more preemptive than
> > PREEMPT!
> >
> > >
> > > Because it is possible in PREEMPT=y to get preempted in the middle of a
> > > read-side critical section, switch to the worker thread executing the
> > > ftrace_sync() and then switch back. But RCU still has to watch that CPU since
> > > the read-side critical section was not completed.
> > >
> > > Or is there a subtlety here with ftrace that I missed?
> > >
> >
> > Hence Amol's patch:
> >
> > > + notrace_hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
> > > + !preemptible());
> >
> > It checks to make sure preemption is off. There is no chance of being
> > preempted in the read side critical section.
>
> Yes, this makes sense. Sorry for the noise. For "sched" RCU cases,
> scheduling on each CPU would work regardless of PREEMPT configuration.
>
> ( I guess I was confusing this case with the non-sched RCU usages (such as using
> rcu_read_lock()) where scheduling a task on each CPU obviously would not work
> with PREEMPT=y. )
>
> By the way would SRCU not work instead of the ftrace_sync() technique? Or is
> the concern that SRCU cannot be used from NMI?

Answering my own question: SRCU would likely slow down ftrace_graph_addr()
unnecessarily, so it is probably not worth using in this path (especially
because ftrace_graph_addr() already starts an implicit read-side critical
section anyway via preempt_disable()).
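
To make the "implicit read-side critical section" point concrete, here is a
rough user-space model of the pattern, not the kernel code itself. Every
model_* name is a hypothetical stand-in invented for illustration:
model_preempt_disable()/model_preemptible() mimic preempt_disable() and
preemptible(), and model_deref_protected() mimics how
rcu_dereference_protected(p, c) does a plain load and only checks the
condition c (the kernel would complain through lockdep rather than bump a
counter).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins (hypothetical), NOT the kernel implementations. */
static int model_preempt_count;	/* models the per-task preempt count */

static void model_preempt_disable(void) { model_preempt_count++; }
static void model_preempt_enable(void)  { model_preempt_count--; }
static bool model_preemptible(void)     { return model_preempt_count == 0; }

struct model_hash { int count; };

/* Models the RCU-protected pointer ftrace_graph_notrace_hash. */
static struct model_hash *model_notrace_hash;

/* Counts dereferences made while still preemptible (i.e. unprotected). */
static int model_bad_derefs;

/*
 * Models rcu_dereference_protected(p, c): a plain load of p plus a check
 * that the caller-supplied protection condition c holds.
 */
static struct model_hash *model_deref_protected(struct model_hash *p, bool c)
{
	if (!c)
		model_bad_derefs++;	/* the kernel would warn via lockdep */
	return p;
}

/*
 * Models the shape of the lookup: preemption is disabled around the
 * dereference, so the section between disable and enable acts as a
 * sched-RCU read-side critical section with no extra lock taken.
 */
static int model_graph_notrace_count(void)
{
	struct model_hash *hash;
	int ret;

	model_preempt_disable();		/* read side begins */
	hash = model_deref_protected(model_notrace_hash,
				     !model_preemptible());
	ret = hash ? hash->count : 0;
	model_preempt_enable();			/* read side ends */
	return ret;
}
```

Calling model_graph_notrace_count() leaves model_bad_derefs untouched, while
calling model_deref_protected() directly with preemption enabled increments
it. That mirrors why the !preemptible() check is enough here: the reader pays
only for the preempt_disable() it was already doing, whereas SRCU would add
its own read-side bookkeeping on top.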

thanks,

- Joel