Re: [PATCH] Identify which executable object the userspace address belongs to. Store the thread group leader id, and use it to look up the address in the process's map. We could have looked up the address in the thread's map, but the thread might not exist by the time we are called. The process might not exist either, but if you are reading trace_pipe, that is unlikely.

From: Török Edwin
Date: Mon Nov 03 2008 - 08:58:05 EST


On 2008-11-03 10:29, Ingo Molnar wrote:
> * Török Edwin <edwintorok@xxxxxxxxx> wrote:
>
>
>>> Your patches are a nice feature we want to have nevertheless - to
>>> be able to see where a user-space app is running has been one of
>>> the historically weak points of kernel instrumentation.
>>>
>> Thanks.
>> It currently works for x86 only, but architecture porters can add
>> support for theirs quite easily; it just needs to be modeled after how
>> oprofile does it, for example.
>> BTW would it make sense to change oprofile and the sysprof tracer to use
>> save_stack_trace_user? It would eliminate some code duplication.
>>
>
> that definitely sounds like the right direction. I've Cc:-ed Robert
> Richter, the Oprofile maintainer - please Cc: him on code that touches
> oprofile.
>
> note that NMI interaction of user-space stackframe walkers can be a
> bit tricky: the basic problem is that if you fetch a user-space
> stackframe that can create a fault

The code in trace_sysprof.c (which I used as a base for
save_stack_trace_user) disables page faults before reading the stack
frames from userspace. Does that avoid this problem?
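
For reference, the frame-reading helper there follows roughly this
pattern (simplified from trace_sysprof.c; with page faults disabled,
__copy_from_user_inatomic fails instead of taking the fault):

struct stack_frame {
	const void __user	*next_fp;
	unsigned long		return_address;
};

static int copy_stack_frame(const void __user *fp, struct stack_frame *frame)
{
	int ret;

	/* reject obviously bad user pointers up front */
	if (!access_ok(VERIFY_READ, fp, sizeof(*frame)))
		return 0;

	ret = 1;
	pagefault_disable();
	/* fails, rather than faulting, if the frame is not resident */
	if (__copy_from_user_inatomic(frame, fp, sizeof(*frame)))
		ret = 0;
	pagefault_enable();

	return ret;
}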

Note that, because it is used from ftrace, the user-stack walker can be
called from the page fault handler itself, and if it were allowed to
fault that could lead to some form of deadlock. Are the ftrace functions
protected against recursively re-entering themselves?
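
(The existing function tracer seems to guard against this with a per-cpu
"disabled" counter, so a nested call on the same CPU is simply dropped;
a simplified sketch of that pattern, with names borrowed from the
in-tree function tracer:)

static void
function_trace_call(unsigned long ip, unsigned long parent_ip)
{
	struct trace_array *tr = func_trace;
	struct trace_array_cpu *data;
	unsigned long flags;
	long disabled;
	int cpu;

	preempt_disable_notrace();
	local_save_flags(flags);
	cpu = raw_smp_processor_id();
	data = tr->data[cpu];

	/* only the outermost call on this cpu actually records the event */
	disabled = atomic_inc_return(&data->disabled);
	if (likely(disabled == 1))
		trace_function(tr, data, ip, parent_ip, flags, preempt_count());

	atomic_dec(&data->disabled);
	preempt_enable_notrace();
}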

> , and the IRET at the end of the
> fault handler will re-enable NMIs (violating the NMI code's
> assumptions).
>

Is this already a problem with oprofile's user-stack walker?

> there are patches on lkml written by Mathieu Desnoyers that solve this
> by changing all the fault paths to use RET instead of IRET. It might
> make sense to dust them off - we carried them for a long time in -tip
> and they were robust. (they just never had any really strong
> justification and were rather complex - that changes now)
>
> Mathieu, what do you think?
>
>
>> Would it make sense to add a script to scripts/tracing that
>> post-processes the output?
>>
>> It would parse a trace log (from trace or latency_trace) and use
>> addr2line to resolve the address to source:line, and if successful
>> replace the relative address with that; and also group identical
>> stack traces together.
>>
>
> sure, please add it to scripts/tracing/.
>

Ok, will do so in v3.

> The best approach would be if the kernel could output the best info by
> default

The kernel could do some grouping and counting (as latencytop does), but
I don't see where that would fit in ftrace's infrastructure.

I think ftrace's one-entry-per-event output is useful in many situations
(debugging, latency measurements), but if the events occur too frequently
it can produce too much data, and it would be more efficient to do some
counting/grouping of similar entries in-kernel before outputting to
userspace.
Perhaps as a layer on top of ftrace? What do you think?

> - but that seems rather hard for addr2line functionality which
> involves debuginfo processing, etc.
>

Yes, it would be overkill to try to do that from the kernel, when it
is so easy to do from userspace ;)

Best regards,
--Edwin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/