Re: [PATCH 10/21] tracing/probe: Split trace_event related data from trace_probe

From: Masami Hiramatsu
Date: Tue Jun 18 2019 - 12:19:19 EST


On Mon, 17 Jun 2019 21:56:43 -0400
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> On Sat, 1 Jun 2019 00:18:16 +0900
> Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
>
> > Split the trace_event related data from trace_probe data structure
> > and introduce trace_probe_event data structure for its folder.
> > This trace_probe_event data structure can have multiple trace_probe.
> >
> > Signed-off-by: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> > ---
> > kernel/trace/trace_kprobe.c | 99 ++++++++++++++++++++++-------------
> > kernel/trace/trace_probe.c | 53 +++++++++++++------
> > kernel/trace/trace_probe.h | 48 +++++++++++++----
> > kernel/trace/trace_uprobe.c | 123 +++++++++++++++++++++++++++++--------------
> > 4 files changed, 221 insertions(+), 102 deletions(-)
> >
> > diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> > index 9d483ad9bb6c..633edb88cd0e 100644
> > --- a/kernel/trace/trace_kprobe.c
> > +++ b/kernel/trace/trace_kprobe.c
> > @@ -180,9 +180,17 @@ unsigned long trace_kprobe_address(struct trace_kprobe *tk)
> > return addr;
> > }
> >
> > +static nokprobe_inline struct trace_kprobe *
> > +trace_kprobe_primary_from_call(struct trace_event_call *call)
> > +{
> > + struct trace_probe *tp = trace_probe_primary_from_call(call);
> > +
> > + return container_of(tp, struct trace_kprobe, tp);
>
>
> Hmm, is there a possibility that trace_probe_primary_from_call() may
> not have a primary?

Good question! Of course if given event_call is not a kprobe event,
it doesn't have primary (or any) trace_probe. But that must not happen
unless user misuses it.
And that list never be the empty, when the last trace probe is released,
the event_call also unregistered and released. See unregister_trace_kprobe()
for details. If there is no siblings on the list, the event_call is also
unregistered before unregistering kprobes, and after unregistering kprobes
the list is unlinked.
(Note that unregister_kprobe() will wait a quiescence period
before return. This means all probe handlers are done before that.)

> > +}
> > +
> > bool trace_kprobe_on_func_entry(struct trace_event_call *call)
> > {
> > - struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
> > + struct trace_kprobe *tk = trace_kprobe_primary_from_call(call);
> >
> > return kprobe_on_func_entry(tk->rp.kp.addr,
> > tk->rp.kp.addr ? NULL : tk->rp.kp.symbol_name,
> > @@ -191,7 +199,7 @@ bool trace_kprobe_on_func_entry(struct trace_event_call *call)
> >
> > bool trace_kprobe_error_injectable(struct trace_event_call *call)
> > {
> > - struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
> > + struct trace_kprobe *tk = trace_kprobe_primary_from_call(call);
> >
> > return within_error_injection_list(trace_kprobe_address(tk));
> > }
> > @@ -295,28 +303,40 @@ static inline int __enable_trace_kprobe(struct trace_kprobe *tk)
> > * Enable trace_probe
> > * if the file is NULL, enable "perf" handler, or enable "trace" handler.
> > */
> > -static int
> > -enable_trace_kprobe(struct trace_kprobe *tk, struct trace_event_file *file)
> > +static int enable_trace_kprobe(struct trace_event_call *call,
> > + struct trace_event_file *file)
> > {
> > - bool enabled = trace_probe_is_enabled(&tk->tp);
> > - int ret = 0;
> > + struct trace_probe *pos, *tp = trace_probe_primary_from_call(call);
> > + struct trace_kprobe *tk;
> > + bool enabled = trace_probe_is_enabled(tp);
> > + int ret = 0, ecode;
> >
> > if (file) {
> > - ret = trace_probe_add_file(&tk->tp, file);
> > + ret = trace_probe_add_file(tp, file);
> > if (ret)
> > return ret;
> > } else
> > - trace_probe_set_flag(&tk->tp, TP_FLAG_PROFILE);
> > + trace_probe_set_flag(tp, TP_FLAG_PROFILE);
> >
> > if (enabled)
> > return 0;
> >
> > - ret = __enable_trace_kprobe(tk);
> > - if (ret) {
> > + enabled = false;
> > + list_for_each_entry(pos, trace_probe_probe_list(tp), list) {
> > + tk = container_of(pos, struct trace_kprobe, tp);
> > + ecode = __enable_trace_kprobe(tk);
> > + if (ecode)
> > + ret = ecode; /* Save the last error code */
> > + else
> > + enabled = true;
>
> So, if we have some enabled but return an error code, what should a
> caller think of that? Wouldn't it be an inconsistent state?

Oops, good catch!
This part is related to caller (ftrace/perf) so should be more careful.
Usually, kprobe enablement should not fail. If one of them has
gone (like a probe on unloaded module), it can be fail but that
should be ignored. I would like to add some additional check so that
- If all kprobes are on the module which is unloaded, enablement
must be failed and return error.
- If any kprobe is enabled, and others are on non-exist modules,
it should succeeded and return OK.
- If any kprobe caused an error not because of unloaded module,
all other enablement should be canceled and return error.

Is that OK for you?

Thank you,

>
> -- Steve
>
>
> > + }
> > +
> > + if (!enabled) {
> > + /* No probe is enabled. Roll back */
> > if (file)
> > - trace_probe_remove_file(&tk->tp, file);
> > + trace_probe_remove_file(tp, file);
> > else
> > - trace_probe_clear_flag(&tk->tp, TP_FLAG_PROFILE);
> > + trace_probe_clear_flag(tp, TP_FLAG_PROFILE);
> > }
> >
> >
>
>
> > +static inline struct trace_probe_event *
> > +trace_probe_event_from_call(struct trace_event_call *event_call)
> > +{
> > + return container_of(event_call, struct trace_probe_event, call);
> > +}
> > +
> > +static inline struct trace_probe *
> > +trace_probe_primary_from_call(struct trace_event_call *call)
> > +{
> > + struct trace_probe_event *tpe = trace_probe_event_from_call(call);
> > +
> > + return list_first_entry(&tpe->probes, struct trace_probe, list);
> > +}
> > +
> > +static inline struct list_head *trace_probe_probe_list(struct trace_probe *tp)
> > +{
> > + return &tp->event->probes;
> > }
> >


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>