Re: [RFC PATCH 1/6] tracing: introduce sleepable tracepoints

From: Joel Fernandes
Date: Mon Nov 02 2020 - 13:51:32 EST


On Tue, Oct 27, 2020 at 09:37:08AM -0400, Mathieu Desnoyers wrote:
>
> ----- On Oct 26, 2020, at 6:43 PM, Alexei Starovoitov alexei.starovoitov@xxxxxxxxx wrote:
>
> > On Fri, Oct 23, 2020 at 03:53:47PM -0400, Michael Jeanson wrote:
> >> -#define __DO_TRACE(tp, proto, args, cond, rcuidle) \
> >> +#define __DO_TRACE(tp, proto, args, cond, rcuidle, tp_flags) \
> >> do { \
> >> struct tracepoint_func *it_func_ptr; \
> >> void *it_func; \
> >> void *__data; \
> >> int __maybe_unused __idx = 0; \
> >> + bool maysleep = (tp_flags) & TRACEPOINT_MAYSLEEP; \
> >> \
> >> if (!(cond)) \
> >> return; \
> >> @@ -170,8 +178,13 @@ static inline struct tracepoint
> >> *tracepoint_ptr_deref(tracepoint_ptr_t *p)
> >> /* srcu can't be used from NMI */ \
> >> WARN_ON_ONCE(rcuidle && in_nmi()); \
> >> \
> >> - /* keep srcu and sched-rcu usage consistent */ \
> >> - preempt_disable_notrace(); \
> >> + if (maysleep) { \
> >> + might_sleep(); \
> >
> > The main purpose of the patch set is to access user memory in tracepoints,
> > right?
>
> Yes, exactly.
>
> > In such case I suggest to use stronger might_fault() here.
> > We used might_sleep() in sleepable bpf and it wasn't enough to catch
> > a combination where sleepable hook was invoked while mm->mmap_lock was
> > taken which may cause a deadlock.
>
> Good point! We will do that for the next round.
>
> By the way, we named this "sleepable" tracepoint (with flag TRACEPOINT_MAYSLEEP),
> but we are open to a better name. Would TRACEPOINT_MAYFAULT be more descriptive ?
> (a "faultable" tracepoint sounds weird though)

What about keeping it might_sleep() here and then adding might_fault() in the
probe handler? Since the probe handler knows that it may cause page fault, it
could itself make sure about it.

One more thought: Should we make _all_ tracepoints sleepable, and then move
the preempt_disable() bit to the probe handler as needed? That could simplify
the tracepoint API as well. Steven said before that whoever registers probes
knows what they are doing so I am ok with that.

No strong feelings one way or the other, for either of these though.

thanks,

- Joel

>
> Thanks,
>
> Mathieu
>
> >
> >> + rcu_read_lock_trace(); \
> >> + } else { \
> >> + /* keep srcu and sched-rcu usage consistent */ \
> >> + preempt_disable_notrace(); \
> > > + } \
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com