Re: [RFC PATCH tip 0/5] tracing filters with BPF

From: Masami Hiramatsu
Date: Tue Dec 03 2013 - 20:13:54 EST


(2013/12/04 3:26), Alexei Starovoitov wrote:
> On Tue, Dec 3, 2013 at 7:33 AM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>> On Tue, 3 Dec 2013 10:16:55 +0100
>> Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>>
>>
>>> So, to do the math:
>>>
>>> tracing 'all' overhead: 95 nsecs per event
>>> tracing 'eth5 + old filter' overhead: 157 nsecs per event
>>> tracing 'eth5 + BPF filter' overhead: 54 nsecs per event
>>>
>>> So via BPF and a fairly trivial filter, we are able to reduce tracing
>>> overhead for real - while old-style filters.
>>
>> Yep, seems that BPF can do what I wasn't able to do with the normal
>> filters. Although, I haven't looked at the code yet, I'm assuming that
>> the BPF works on the parameters passed into the trace event. The normal
>> filters can only process the results of the trace (what's being
>> recorded) not the parameters of the trace event itself. To get what's
>> recorded, we need to write to the buffer first, and then we decide if
>> we want to keep the event or not and discard the event from the buffer
>> if we do not.
>>
>> That method does not reduce overhead at all, and only adds to it, as
>> Alexei's tests have shown. The purpose of the filter was not to reduce
>> overhead, but to reduce filling the buffer with needless data.
>
> Precisely.
> Assumption is that filters will filter out majority of the events.
> So filter takes pt_regs as input, has to interpret them and call
> bpf_trace_printk if it really wants to store something for the human
> to see.
> We can extend bpf trace filters to return true/false to indicate
> whether TP_printk-format specified as part of the event should be
> printed as well, but imo that's unnecessary.
> When I was using bpf filters to debug networking bits I didn't need
> that printk format of the event. I only used event as an entry point,
> filtering out things and printing different fields vs initial event.
> More like what developers do when they sprinkle
> trace_printk/dump_stack through the code while debugging.
>
> the only inconvenience so far is to know how parameters are getting
> into registers. on x86-64, arg1 is in rdi, arg2 is in rsi,... I want
> to improve that after first step is done.

Actually, that part is done by the perf-probe and ftrace dynamic events
(kernel/trace/trace_probe.c). I think this generic BPF is good for
re-implementing fetch methods. :)
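
To make the register point concrete, here is a minimal user-space sketch of
fetching the N-th integer argument from a saved register set on x86-64. It is
an illustration only, not the code in kernel/trace/trace_probe.c: struct
demo_regs, demo_arg_offset and demo_fetch_arg() are hypothetical names, and
the struct is a stand-in that models just the six argument registers of the
x86-64 System V calling convention (rdi, rsi, rdx, rcx, r8, r9), not the real
struct pt_regs layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the saved register set. Only the six registers
 * that carry integer arguments on x86-64 (System V ABI) are modelled.
 */
struct demo_regs {
	uint64_t rdi, rsi, rdx, rcx, r8, r9;
};

/* Byte offsets of the registers holding integer arguments 1..6, in order. */
static const size_t demo_arg_offset[] = {
	offsetof(struct demo_regs, rdi),
	offsetof(struct demo_regs, rsi),
	offsetof(struct demo_regs, rdx),
	offsetof(struct demo_regs, rcx),
	offsetof(struct demo_regs, r8),
	offsetof(struct demo_regs, r9),
};

#define DEMO_MAX_ARGS (sizeof(demo_arg_offset) / sizeof(demo_arg_offset[0]))

/*
 * Fetch the n-th (1-based) integer argument from the saved registers.
 * Returns 0 and stores the value in *out on success, or -1 if n is not
 * in the range 1..6 (arguments beyond the sixth live on the stack and
 * are not handled by this sketch).
 */
static int demo_fetch_arg(const struct demo_regs *regs, unsigned int n,
			  uint64_t *out)
{
	assert(regs != NULL && out != NULL);

	if (n < 1 || n > DEMO_MAX_ARGS)
		return -1;

	*out = *(const uint64_t *)((const char *)regs + demo_arg_offset[n - 1]);
	return 0;
}
```

With a table like this, a filter could ask for "arg2" instead of having to
know that it lives in rsi; a per-architecture table would be needed for
anything other than x86-64.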

Thank you,

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@xxxxxxxxxxx