Re: [PATCH v4 16/27] tracing: Remove regular RCU context for _rcuidle tracepoints (again)

From: Steven Rostedt
Date: Fri Mar 06 2020 - 13:59:40 EST


On Fri, 6 Mar 2020 13:45:38 -0500
Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:

> On Fri, Mar 06, 2020 at 12:55:00PM -0500, Steven Rostedt wrote:
> > On Fri, 6 Mar 2020 11:04:28 -0500 (EST)
> > Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> >
> > > If we care about not adding those extra branches on the fast-path, there is
> > > an alternative way to do things: BPF could provide two distinct probe callbacks,
> > > one meant for rcuidle tracepoints (which would have the trace_rcu_enter/exit), and
> > > the other for the 99% of other callsites which have RCU watching.
> > >
> > > I would recommend performing benchmarks justifying the choice of one approach over
> > > the other though.
> >
> > I just whipped this up (haven't even tried to compile it), but this should
> > satisfy everyone. Those that register a callback that needs RCU protection
> > simply register with one of the _rcu versions, and everything is handled for
> > them. And since DO_TRACE is a macro and rcuidle is a constant, the RCU
> > protection code will be compiled out for locations where it is not needed.
> >
> > With this, perf doesn't even need to do anything extra but register with
> > the "_rcu" version.
>
> Looks nice! Some comments below:
>
> > -- Steve
> >
> > diff --git a/include/linux/tracepoint-defs.h b/include/linux/tracepoint-defs.h
> > index b29950a19205..582dece30170 100644
> > --- a/include/linux/tracepoint-defs.h
> > +++ b/include/linux/tracepoint-defs.h
> > @@ -25,6 +25,7 @@ struct tracepoint_func {
> >  	void *func;
> >  	void *data;
> >  	int prio;
> > +	int requires_rcu;
> >  };
> >
> >  struct tracepoint {
> > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> > index 1fb11daa5c53..5f4de82ffa0f 100644
> > --- a/include/linux/tracepoint.h
> > +++ b/include/linux/tracepoint.h
> > @@ -179,25 +179,28 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
> >  		 * For rcuidle callers, use srcu since sched-rcu \
> >  		 * doesn't work from the idle path. \
> >  		 */ \
> > -		if (rcuidle) { \
> > +		if (rcuidle) \
> >  			__idx = srcu_read_lock_notrace(&tracepoint_srcu);\
>
> Small addition:
> To prevent confusion, we could make it clearer that SRCU here is just to
> protect the tracepoint function table and not the callbacks themselves.
>
> > -			rcu_irq_enter_irqson(); \
> > -		} \
> >  \
> >  		it_func_ptr = rcu_dereference_raw((tp)->funcs); \
> >  \
> >  		if (it_func_ptr) { \
> >  			do { \
> > +				int rcu_flags; \
> >  				it_func = (it_func_ptr)->func; \
> > +				if (rcuidle && \
> > +				    (it_func_ptr)->requires_rcu) \
> > +					rcu_flags = trace_rcu_enter(); \
> >  				__data = (it_func_ptr)->data; \
> >  				((void(*)(proto))(it_func))(args); \
> > +				if (rcuidle && \
> > +				    (it_func_ptr)->requires_rcu) \
> > +					trace_rcu_exit(rcu_flags); \
>
> Nit: if we have to incur the cost of trace_rcu_enter() anyway, we could call
> it just once and then call trace_rcu_exit() after the do-while loop. That way
> we pay the price only once.
>

I thought about that, but the common case is only one callback attached at
a time. Making the code more complex for the uncommon case seemed like
overkill. If we find that it does help, it's best to do that as a separate
patch, because then if something goes wrong we know where it happened.

Currently, this has the same overhead as if each callback did it itself, as
we were proposing, but without the added need to do it in every instance of
the callback.
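
For completeness, the registration side isn't in the hunks quoted above. What
I have in mind for the "_rcu" variant is something like the untested sketch
below; the function name is just illustrative, but tracepoints_mutex,
tracepoint_add_func() and TRACEPOINT_DEFAULT_PRIO are the existing internals
in kernel/tracepoint.c:

/*
 * Illustrative only: an "_rcu" register variant that tags the callback
 * as needing full RCU protection.  DO_TRACE then only pays for
 * trace_rcu_enter()/trace_rcu_exit() on rcuidle tracepoints for
 * callbacks registered this way.
 */
int tracepoint_probe_register_rcu(struct tracepoint *tp,
				  void *probe, void *data)
{
	struct tracepoint_func tp_func;
	int ret;

	mutex_lock(&tracepoints_mutex);
	tp_func.func = probe;
	tp_func.data = data;
	tp_func.prio = TRACEPOINT_DEFAULT_PRIO;
	tp_func.requires_rcu = 1;
	ret = tracepoint_add_func(tp, &tp_func, TRACEPOINT_DEFAULT_PRIO);
	mutex_unlock(&tracepoints_mutex);

	return ret;
}

Then perf would just switch its registration call over to the _rcu variant,
and nothing changes in the fast path for callbacks that don't need it.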

-- Steve