Re: [RFC] systemtap: begin the process of using proper kernel APIs(part1: use kprobe symbol_name/offset instead of address)

From: James Bottomley
Date: Wed Jul 16 2008 - 21:56:45 EST


On Wed, 2008-07-16 at 20:05 -0400, Masami Hiramatsu wrote:
> James Bottomley wrote:
> > On Wed, 2008-07-16 at 18:40 -0400, Masami Hiramatsu wrote:
> >> James Bottomley wrote:
> >>> One of the big nasties of systemtap is the way it tries to embed
> >>> virtually the entirety of the kernel symbol table in the probe modules
> >>> it constructs. This is highly undesirable because it represents a
> >>> subversion of the kernel API to gain access to unexported symbols. At
> >>> least for kprobes, the correct way to do this is to specify the probe
> >>> point by symbol and offset.
> >>>
> >>> This patch converts systemtap to use the correct kprobe
> >>> symbol_name/offset pair to identify the probe location.
> >> Hi James,
> >>
> >> I think your suggestion is a good step. Of course, it might
> >> have to solve some issues.
> >>
> >> Unfortunately, current kprobe's symbol_name interface is not
> >> so clever. For example, if you specify a static function
> >> which is defined at several places in the kernel(ex. do_open),
> >> it always pick up the first one in kallsyms, even if systemtap
> >> can find all of those functions.
> >> (you can find many duplicated symbols in /proc/kallsyms)
> >
> > Right, but realistically only functions which have a strict existence
> > (i.e. those for whom an address could be taken) can be used; functions
> > which are fully inlined (as in have no separate existence) can't.
> > That's why the patch finds the closest function with an address to match
> > on.
>
> Sure, inlined functions are embedded in a caller function, so the
> closest function is the correct owner.
>
> However, I meant local-scope functions can have same name if
> they are defined in different scope. And even though, both of
> them are shown in kallsyms. This mean, you can see the functions
> which have real different existence but have same symbol.
>
> Would you mean systemtap should not probe those name-conflicted
> functions?

Actually, I wasn't aware we had any.

> >> So, we might better improve kallsyms to treat this case
> >> and find what is a better way to specify symbols and addresses.
> >
> > Well, both the dwarf and the kallsyms know which are the functions that
> > have a real existence, so the tool can work it out. It has a real
> > meaning too because the chosen symbol must be the parent routine of all
> > the nested inlines.
>
> Hmm, here is what I got with your patch;
> $ stap --kelf -e 'probe kernel.function("do_open"){}' -p2
> # probes
> kernel.function("do_open@arch/x86/kernel/apm_32.c:1557") /* pc=<do_open+0x0> */ /* <- kernel.function("do_open") */
> kernel.function("do_open@fs/block_dev.c:928") /* pc=<do_open+0x0> */ /* <- kernel.function("do_open") */
> kernel.function("do_open@fs/nfsctl.c:24") /* pc=<sys_nfsservctl+0x55> */ /* <- kernel.function("do_open") */
> kernel.function("do_open@ipc/mqueue.c:630") /* pc=<do_open+0x0> */ /* <- kernel.function("do_open") */
>
> Without your patch;
> $ stap -e 'probe kernel.function("do_open"){}' -p2
> # probes
> kernel.function("do_open@arch/x86/kernel/apm_32.c:1557") /* pc=0x10382 */ /* <- kernel.function("do_open") */
> kernel.function("do_open@fs/block_dev.c:928") /* pc=0xa0750 */ /* <- kernel.function("do_open") */
> kernel.function("do_open@fs/nfsctl.c:24") /* pc=0xa6411 */ /* <- kernel.function("do_open") */
> kernel.function("do_open@ipc/mqueue.c:630") /* pc=0xc55a6 */ /* <- kernel.function("do_open") */
>
> Obviously, the 3rd "do_open" is fully inlined, so it can be
> correctly handled by kprobes, because it has different
> symbol(sys_nfsservctl). However, other "do_open" have
> same symbol(do_open). these will be put on same
> address (at the first symbol on kallsyms list).
>
> So, we need a bridge for the gap of function addresses
> between kallsyms and dwarf.

You mean this particular problem:

hobholes:/home/jejb/git/BUILD-2.6# grep do_open /proc/kallsyms
c01af160 t do_open
c01d5d40 t do_open

It's certainly a material defect in the current API. I'll think about
it and see if I can come up with a solution.

> [...]
> >> Could we provide a separated GPL'd interface to access named global
> >> symbols which is based on kallsyms?
> >
> > Yes, I think so ... it's just a case of working out what and how; but to
> > do that we need a consumer of the interface.
>
> I agree with you, we need to change both of systemtap and kernel.
>
> Thank you,

You're welcome.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/