Re: [RFC][PATCH -tip 0/9] tracing: kprobe-based event tracer

From: Masami Hiramatsu
Date: Thu Mar 19 2009 - 23:34:27 EST


Frederic Weisbecker wrote:
> On Thu, Mar 19, 2009 at 05:09:56PM -0400, Masami Hiramatsu wrote:
>> Hi,
>>
>> This is a series of patches which introduces a proof-of-concept
>> kprobe-based event tracer to ftrace. I think that we could port some
>> tracing features from systemtap onto this vehicle.
>> This can be applied on the linux-2.6-tip tree.
>>
>> This patchset includes the following changes:
>> - Add kprobe-tracer plugin
>> - Add kernel_trap_sp() on x86, ia64, powerpc, s390, arm, which is
>> ported from the systemtap runtime.
>> - Add module_*probe API for respawning/removing kprobes when the target
>> module is coming/going.
>>
>> It's still unclear whether the last item, module_*probe, would be better
>> provided as an API or just embedded in trace_kprobe.c.
>>
>> Future items:
>> - Use binary print.
>> - Add kernel_trap_sp() on other archs.
>> - Support symbol-based memory fetching (for global variables)
>> - Support primitive types(long, ulong, int, uint, etc) for args.
>> - Support indirect memory fetch from register etc.
>> - Check insertion point safety by using instruction decoder.
>>
>> kprobe-based event tracer
>> ---------------------------
>>
>> This tracer is similar to the events tracer, which is based on the Tracepoint
>> infrastructure. Instead of Tracepoints, this tracer is based on kprobes (kprobe
>> and kretprobe). It can probe anywhere kprobes can probe (that is, the body of
>> any function except __kprobes functions).
>>
>> Unlike the function tracer, this tracer can probe instructions inside of
>> kernel functions. It allows you to check which instruction has been executed.
>>
>> Unlike the Tracepoint based events tracer, this tracer can add new probe points
>> on the fly.
>>
>> Similar to the events tracer, this tracer doesn't need to be activated via
>> current_tracer; instead, just set probe points via
>> /debug/tracing/kprobe_probes.
>>
>> Synopsis of kprobe_probes:
>> p SYMBOL[+offs|-offs]|MEMADDR [FETCHARGS] : set a probe
>
>
> Ahh, I see this is not only about parameters but also about very low level
> debugging, such as registers dumps.
>
> This is very powerful.

Please take care not to shoot yourself in the foot :)
This tracer doesn't have a safety mechanism (e.g. an instruction boundary checker) yet.
So, for now, we need to check probe addresses against objdump -d output.
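
Until such a checker exists, something like the following might help when picking an offset. It is only a rough sketch and not part of this patchset: insn_offsets is a hypothetical helper name, and it assumes the default objdump -d layout, where address, raw bytes and mnemonic are separated by tabs and continuation lines of long instructions carry no mnemonic.

```sh
# Print the byte offsets (in decimal) at which instructions start inside
# the function named by $1, reading `objdump -d` output from stdin.
insn_offsets() {
    awk -F'\t' -v sym="$1" '
        # Convert the low 12 hex digits of s to a number. 48 bits stay exact
        # in a double, and the dropped high digits cancel in the subtraction.
        function hex(s,    i, c, n) {
            s = tolower(s)
            if (length(s) > 12) s = substr(s, length(s) - 11)
            n = 0
            for (i = 1; i <= length(s); i++) {
                c = index("0123456789abcdef", substr(s, i, 1))
                n = n * 16 + c - 1
            }
            return n
        }
        # Symbol header line, e.g. "c0123450 <do_sys_open>:"
        /^[0-9a-fA-F]+ <.*>:$/ {
            split($0, h, " ")
            in_sym = (h[2] == "<" sym ">:")
            if (in_sym) base = hex(h[1])
            next
        }
        # Instruction line: address, raw bytes, mnemonic. Continuation lines
        # have only two tab-separated fields, so they are skipped.
        in_sym && NF >= 3 {
            addr = $1
            gsub(/[ :]/, "", addr)
            print hex(addr) - base
        }'
}
```

It would be used like "objdump -d vmlinux | insn_offsets do_sys_open", and only the printed offsets would be candidates for SYMBOL+offs. The offsets are printed in decimal; which base kprobe_probes expects for offs is not stated in this thread.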

>
>> r SYMBOL[+0] [FETCHARGS] : set a return probe
>>
>> FETCHARGS:
>> rN : Fetch Nth register (N >= 0)
>
>
> Ah, it would be useful to have per-arch register naming here,
> so that one doesn't have to feel dizzy when resolving,
> say, the edi register to a number.

Yeah, that would be a good enhancement.
This patchset just focuses on implementing the basic functionality.


>> sN : Fetch Nth entry of stack (N >= 0)
>> mADDR : Fetch memory at ADDR (ADDR should be in kernel)
>> aN : Fetch function argument. (N >= 1)(*)
>> rv : Fetch return value.(**)
>> rp : Fetch return address.(**)
>>
>> (*) aN may not be correct for asmlinkage functions or within a function body.
>> (**) only for return probe.
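
Since nothing checks the definitions yet, a small wrapper could catch typos before they are written to kprobe_probes. This is a minimal sketch based only on the synopsis above: is_fetcharg and probe_def are hypothetical helper names, and the ADDR part of mADDR is not validated because its format is not specified here.

```sh
# Return 0 if $1 is a FETCHARG allowed by the synopsis for probe type $2
# ("p" for a kprobe, "r" for a return probe), 1 otherwise.
is_fetcharg() {
    fa_arg=$1
    fa_type=$2
    case "$fa_arg" in
        rv|rp)
            # Return value and return address exist only on return probes.
            [ "$fa_type" = r ] && return 0
            return 1 ;;
        r*|s*|a*)
            fa_n=${fa_arg#?}              # strip the leading letter
            case "$fa_n" in
                ''|*[!0-9]*) return 1 ;;  # N must be a decimal number
            esac
            # aN counts from 1; rN and sN count from 0.
            case "$fa_arg" in
                a*) [ "$fa_n" -ge 1 ] || return 1 ;;
            esac
            return 0 ;;
        m?*)
            # Assumption: any non-empty ADDR text is accepted here.
            return 0 ;;
    esac
    return 1
}

# Print "TYPE TARGET [FETCHARGS]" if every argument passes is_fetcharg.
probe_def() {
    pd_type=$1
    pd_target=$2
    case "$pd_type" in
        p|r) ;;
        *) echo "unknown probe type: $pd_type" >&2; return 1 ;;
    esac
    shift 2
    for pd_arg in "$@"; do
        if ! is_fetcharg "$pd_arg" "$pd_type"; then
            echo "bad FETCHARG for '$pd_type' probe: $pd_arg" >&2
            return 1
        fi
    done
    if [ $# -eq 0 ]; then
        echo "$pd_type $pd_target"
    else
        echo "$pd_type $pd_target $*"
    fi
}
```

For example, "probe_def p do_sys_open a1 a2 > /debug/tracing/kprobe_probes" would write the definition only if it passes the check, while "probe_def p do_sys_open rv" is rejected because rv is only for return probes.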
>>
>> E.g.
>> echo p do_sys_open a1 a2 a3 a4 > /debug/tracing/kprobe_probes
>>
>> This sets a kprobe at the top of the do_sys_open() function, recording
>> its 1st to 4th arguments.
>>
>> echo r do_sys_open rv rp >> /debug/tracing/kprobe_probes
>>
>> This sets a kretprobe on the return point of the do_sys_open() function,
>> recording its return value and return address.
>>
>> echo > /debug/tracing/kprobe_probes
>>
>> This clears all probe points. You can see the traced information via
>> /debug/tracing/trace.
>>
>> cat /debug/tracing/trace
>> # tracer: nop
>> #
>> # TASK-PID CPU# TIMESTAMP FUNCTION
>> # | | | | |
>> <...>-2376 [001] 262.389131: do_sys_open: @do_sys_open+0 0xffffff9c 0x98db83e 0x8880 0x0
>> <...>-2376 [001] 262.391166: sys_open: <-do_sys_open+0 0x5 0xc06e8ebb
>> <...>-2376 [001] 264.384876: do_sys_open: @do_sys_open+0 0xffffff9c 0x98db83e 0x8880 0x0
>> <...>-2376 [001] 264.386880: sys_open: <-do_sys_open+0 0x5 0xc06e8ebb
>> <...>-2084 [001] 265.380330: do_sys_open: @do_sys_open+0 0xffffff9c 0x804be3e 0x0 0x1b6
>> <...>-2084 [001] 265.380399: sys_open: <-do_sys_open+0 0x3 0xc06e8ebb
>>
>> @SYMBOL means that the kernel hit a probe, and <-SYMBOL means that the kernel
>> returned from SYMBOL (e.g. "sysenter_do_call: <-sys_open+0" means the kernel
>> returned from sys_open to sysenter_do_call).
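
For post-processing, the output format above can be split apart with awk. The following is a sketch only: parse_kprobe_trace is a hypothetical helper name, it assumes the column layout shown in the sample output, and it assumes task names contain no spaces.

```sh
# Read /debug/tracing/trace style output from stdin and print one line per
# event: TIMESTAMP KIND PROBE FUNCTION [VALUES...]
# KIND is "hit" for @SYMBOL events and "return" for <-SYMBOL events.
# FUNCTION is the name printed before the marker (for a return event this
# is the function the kernel returned to).
parse_kprobe_trace() {
    awk '
        /^#/ || NF < 5 { next }            # skip header and short lines
        {
            ts = $3; sub(/:$/, "", ts)     # "262.389131:" -> "262.389131"
            fn = $4; sub(/:$/, "", fn)     # "do_sys_open:" -> "do_sys_open"
            loc = $5
            if (loc ~ /^@/)       { kind = "hit";    loc = substr(loc, 2) }
            else if (loc ~ /^<-/) { kind = "return"; loc = substr(loc, 3) }
            else next
            vals = ""
            for (i = 6; i <= NF; i++) vals = vals " " $i
            print ts, kind, loc, fn vals
        }'
}
```

With the sample above, "cat /debug/tracing/trace | parse_kprobe_trace" would turn the first two events into "262.389131 hit do_sys_open+0 do_sys_open 0xffffff9c 0x98db83e 0x8880 0x0" and "262.391166 return do_sys_open+0 sys_open 0x5 0xc06e8ebb".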
>
>
> Nice :-)

Thanks ;)

>
> Frederic.
>
>
>> Documentation/ftrace.txt | 66 ++++
>> arch/arm/include/asm/ptrace.h | 3 +-
>> arch/ia64/include/asm/ptrace.h | 6 +
>> arch/powerpc/include/asm/ptrace.h | 1 +
>> arch/s390/include/asm/ptrace.h | 5 +-
>> arch/x86/include/asm/ptrace.h | 4 +-
>> include/linux/kprobes.h | 39 ++
>> kernel/kprobes.c | 250 ++++++++++++++
>> kernel/trace/Kconfig | 9 +
>> kernel/trace/Makefile | 1 +
>> kernel/trace/trace_kprobe.c | 688 +++++++++++++++++++++++++++++++++++++
>> 11 files changed, 1067 insertions(+), 5 deletions(-)
>>
>>
>> Thank you,
>>
>> --
>> Masami Hiramatsu
>>
>> Software Engineer
>> Hitachi Computer Products (America) Inc.
>> Software Solutions Division
>>
>> e-mail: mhiramat@xxxxxxxxxx
>>
>>
>>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at http://www.tux.org/lkml/
>

--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@xxxxxxxxxx
