Re: A bug in ftrace - dynamic fops

From: Miroslav Benes
Date: Thu Aug 11 2016 - 10:54:13 EST


On Thu, 11 Aug 2016, Steven Rostedt wrote:

> On Thu, 11 Aug 2016 16:08:58 +0200 (CEST)
> Miroslav Benes <mbenes@xxxxxxx> wrote:
>
> > /*
> > * Dynamic ops may be freed, we must make sure that all
> > * callers are done before leaving this function.
> > * The same goes for freeing the per_cpu data of the per_cpu
> > * ops.
> > *
> > * Again, normal synchronize_sched() is not good enough.
> > * We need to do a hard force of sched synchronization.
> > * This is because we use preempt_disable() to do RCU, but
> > * the function tracers can be called where RCU is not watching
> > * (like before user_exit()). We can not rely on the RCU
> > * infrastructure to do the synchronization, thus we must do it
> > * ourselves.
> > */
> > if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_PER_CPU)) {
> > schedule_on_each_cpu(ftrace_sync);
> >
> > arch_ftrace_trampoline_free(ops);
> >
> > if (ops->flags & FTRACE_OPS_FL_PER_CPU)
> > per_cpu_ops_free(ops);
> > }
> >
> > I think the wording could be interpreted in a way that ftrace is
> > responsible which is not true according to you.
>
> OK, then I should update it to be a bit more clear, that it is only
> worried about its own infrastructure and not what goes on within
> ops->func().
>
> I take feedback like yours seriously. If you are confused by it, then
> others may be too.
>
> Thus, I think there should be two points documented a bit better.
>
> #1, ops->func() is special, and can be called from any context
> including NMI. That means unless you take special care about exactly
> what functions are going to be traced (via the filter hashes, which are
> not even supported when DYNAMIC_FTRACE is not set), then the
> ops->func() must be treated with the same care as an NMI handler would
> be. (no locking, no sleeping, etc).
>
> #2, the only synchronization that ftrace will take care of is its own.
> That is, the dynamic trampolines have race conditions in the
> implementation. The above comment is all about handling its own race
> conditions, and doesn't care about what goes on within ops->func().
> Although it does give a utility if ops->func() is not recursion safe,
> but other than than, like all other function hooks, any synchronization
> within the hooks must be taken care of by the caller to
> (un)register_ftrace_function(). That includes hooks doing strange
> things (like sleeping) and then freeing the ops after the registering.
>
> Although, with #1, #2 may not be needed.

May not, but it would not hurt to include it as well. There are some good
points there and it makes clear that what I originally assumed to be a
wanted behaviour is merely a side effect.

So such two comments in the code or in the documentation would more than
satisfy me.

Thanks a lot,
Miroslav