Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

From: Paul E. McKenney
Date: Wed Aug 08 2018 - 16:19:03 EST

Next message: Wolfram Sang: "Re: [PATCH v3 1/6] i2c: designware: use generic table matching"
Previous message: Luck, Tony: "RE: sb_edac.c lacks PCI domain support?"
In reply to: Joel Fernandes: "Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage"
Next in thread: Joel Fernandes: "Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, Aug 08, 2018 at 12:24:20PM -0700, Joel Fernandes wrote:
> On Wed, Aug 8, 2018 at 7:49 AM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> [...]
> >
> >> In that case, based on what you're saying, the patch I sent to use a
> >> different srcu_struct for NMI is still good, I guess...
> >
> > As long as you wait for both SRCU grace periods. Hmmm... Maybe that means
> > that there is still a use for synchronize_rcu_mult():
> >
> > void call_srcu_nmi(struct rcu_head *rhp, rcu_callback_t func)
> > {
> > 	call_srcu(&trace_srcu_struct_nmi, rhp, func);
> > }
> >
> > void call_srcu_nonmi(struct rcu_head *rhp, rcu_callback_t func)
> > {
> > 	call_srcu(&trace_srcu_struct_nonmi, rhp, func);
> > }
> >
> > ...
> >
> > /* Wait concurrently on the two grace periods. */
> > synchronize_rcu_mult(call_srcu_nmi, call_srcu_nonmi);
> >
> > On the other hand, I bet that doing this is just fine in your use case:
> >
> > synchronize_srcu(&trace_srcu_struct_nmi);
> > synchronize_srcu(&trace_srcu_struct_nonmi);
> >
> > But please note that synchronize_rcu_mult() is no longer in my -rcu tree,
> > so if you do want it, please let me know (and please let me know why it
> > is important).
>
> I did the chaining thing (one callback calling another), that should
> work too right? I believe that is needed so that the tracepoint
> callbacks are freed at one point and only when both NMI and non-NMI
> read sections have completed.

Yes, that works also. It is possible to make that happen concurrently
via atomic_dec_and_test() or similar, but if the latency is not a problem,
why bother?
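
For illustration only, here is a minimal userspace sketch of the concurrent
alternative, assuming C11 atomics as a stand-in for the kernel's
atomic_dec_and_test(). The names struct deferred_free, deferred_free_init()
and grace_period_done() are hypothetical and are not kernel APIs. The idea is
that each of the two call_srcu() callbacks would run the same body, and
whichever one finishes last does the free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the object protected by both grace periods. */
struct deferred_free {
	atomic_int pending;	/* grace periods still outstanding */
	void *payload;		/* memory to release once both have ended */
	bool freed;		/* set once payload has been released */
};

/* Arm the object for two outstanding grace periods (NMI and non-NMI). */
static void deferred_free_init(struct deferred_free *df, void *payload)
{
	atomic_init(&df->pending, 2);
	df->payload = payload;
	df->freed = false;
}

/*
 * Body shared by both callbacks. atomic_fetch_sub() returns the previous
 * value, so the caller that observes 1 is the last one to finish, which
 * models atomic_dec_and_test() returning true.
 */
static void grace_period_done(struct deferred_free *df)
{
	assert(!df->freed);	/* must not be invoked more than twice */
	if (atomic_fetch_sub(&df->pending, 1) == 1) {
		free(df->payload);
		df->payload = NULL;
		df->freed = true;
	}
}
```

The callbacks may complete in either order. Compared with chaining, this only
saves the latency of running the second grace period after the first.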

> >> >> It does start to seem like a show stopper :-(
> >> >
> >> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> >> > be added, which would do atomic ops on sp->sda->srcu_lock_count. Not sure
> >> > whether this would be fast enough to be useful, but easy to provide:
> >> >
> >> > int __srcu_read_lock_nmi(struct srcu_struct *sp) /* UNTESTED. */
> >> > {
> >> > 	int idx;
> >> >
> >> > 	idx = READ_ONCE(sp->srcu_idx) & 0x1;
> >> > 	atomic_inc(&sp->sda->srcu_lock_count[idx]);
> >> > 	smp_mb__after_atomic(); /* B */ /* Avoid leaking critical section. */
> >> > 	return idx;
> >> > }
> >> >
> >> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> >> > {
> >> > 	smp_mb__before_atomic(); /* C */ /* Avoid leaking critical section. */
> >> > 	atomic_inc(&sp->sda->srcu_unlock_count[idx]);
> >> > }
> >> >
> >> > With appropriate adjustments to allow Tiny RCU to work as well.
> >> >
> >> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
> >> > In fact, the NMI handlers are the one place you -don't- need to use
> >> > _nmi(), strangely enough.
> >> >
> >> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> >> > some architectures, for example.
> >>
> >> Continuing Steve's question on regular interrupts, do we need to use
> >> this atomic_inc API for regular interrupts as well? So I guess
> >
> > If NMIs use one srcu_struct and non-NMI uses another, the current
> > srcu_read_lock() and srcu_read_unlock() will work just fine. If any given
> > srcu_struct needs both NMI and non-NMI readers, then we really do need
> > __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct.
>
> Yes, I believe as long as in_nmi() works reliably, we can use the
> right srcu_struct (NMI vs non-NMI) and it would be fine.
>
> Going through this thread, it sounds though that this_cpu_inc may not
> be reliable on all architectures even for non-NMI interrupts and
> local_inc may be the way to go.

My understanding is that this_cpu_inc() is defined to handle interrupts,
so any architecture on which it is unreliable needs to fix its bug. ;-)
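
To make the two-srcu_struct arrangement concrete, here is a toy userspace
model, assuming C11 atomics and an explicit flag in place of in_nmi(). All
toy_* names are hypothetical. It only shows readers picking a domain by
context and an updater having to see both domains idle. It does not model a
real SRCU grace period, which also flips the index and orders the counter
reads with memory barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy reader domain; a real srcu_struct uses per-CPU counters. */
struct toy_domain {
	atomic_ulong lock_count[2];
	atomic_ulong unlock_count[2];
	atomic_int idx;
};

/* Static storage, so all counters start at zero. */
static struct toy_domain toy_nmi_domain;
static struct toy_domain toy_nonmi_domain;

/* Stand-in for an in_nmi() check: the caller passes its context explicitly. */
static struct toy_domain *toy_pick_domain(bool in_nmi_ctx)
{
	return in_nmi_ctx ? &toy_nmi_domain : &toy_nonmi_domain;
}

/* Enter a read-side section; returns the index to hand back on unlock. */
static int toy_read_lock(struct toy_domain *d)
{
	int idx = atomic_load(&d->idx) & 0x1;

	atomic_fetch_add(&d->lock_count[idx], 1);
	return idx;
}

/* Leave a read-side section entered with the given index. */
static void toy_read_unlock(struct toy_domain *d, int idx)
{
	assert(idx == 0 || idx == 1);
	atomic_fetch_add(&d->unlock_count[idx], 1);
}

/* A domain is idle when every lock has been matched by an unlock. */
static bool toy_domain_idle(struct toy_domain *d)
{
	for (int i = 0; i < 2; i++) {
		if (atomic_load(&d->lock_count[i]) !=
		    atomic_load(&d->unlock_count[i]))
			return false;
	}
	return true;
}

/* The updater must see both domains idle, mirroring the two waits. */
static bool toy_all_readers_done(void)
{
	return toy_domain_idle(&toy_nmi_domain) &&
	       toy_domain_idle(&toy_nonmi_domain);
}
```

The point of the model is that a reader in either context keeps the updater
waiting, which is why both grace periods have to elapse before anything is
freed.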

> For the next merge window (not this one), let's do that then? Paul, if you
> could provide me an SRCU API that uses local_inc, then I believe that
> coupled with this patch should be all that's needed:
> https://lore.kernel.org/patchwork/patch/972657/
>
> Steve did express concern, though, about whether in_nmi() works reliably
> (i.e. that a tracepoint doesn't fire from "thunk" code before in_nmi() is
> available). Any thoughts on that, Steve?

Agreed, not the upcoming merge window. But we do need to work out
exactly what is the right way to do this.

Thanx, Paul
