Re: [RFC][BUG] tracer: Fails to work

From: Mathieu Desnoyers
Date: Thu Jan 28 2016 - 08:38:15 EST


----- On Jan 28, 2016, at 3:08 AM, Peter Zijlstra peterz@xxxxxxxxxxxxx wrote:

> Hi Steve,
>
> So I was hunting wabbits the other day and ftrace failed to work.
>
> After I cursed a bit on IRC, Thomas found the below; after 'fixing' it
> like so, things worked well enough to get the trace out.
>
> Relying on things like this makes it entirely impossible to get any trace
> data out if you've wedged a CPU. Exactly the kind of situation you want
> trace data for.
>
> Please consider an appropriate change to make this happen.

I wonder if we should start considering SRCU, rather than RCU-sched,
to protect the critical sections of tracepoints (and other
instrumentation mechanisms)?

SRCU would allow us to wait for a grace period that targets tracing
specifically, which should make the tracer more robust in the face of
misbehaving CPUs. It would also allow us to do blocking calls (e.g.
get_user()) from syscall entry/exit tracing, which I've been wanting to
do for a while.
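
To illustrate the per-domain property, here is a rough userspace sketch.
It is a toy, single-threaded model with hypothetical names (toy_srcu_*),
not the kernel's SRCU implementation: it has no per-CPU counters, no
memory barriers, and it polls for completion instead of blocking. It only
shows that a grace period in one domain waits for readers that entered
that same domain before the index flip, and for nothing else.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy, single-threaded model of one SRCU domain. Hypothetical names;
 * this is NOT the kernel implementation (no per-CPU counters, no
 * memory barriers, no blocking).
 */
struct toy_srcu {
	int active_idx;   /* index that new readers sample */
	long readers[2];  /* readers currently inside, per index */
};

/* Enter a read-side critical section; returns the index to unlock with. */
static int toy_srcu_read_lock(struct toy_srcu *sp)
{
	int idx = sp->active_idx;

	sp->readers[idx]++;
	return idx;
}

/* Leave the read-side critical section entered with index idx. */
static void toy_srcu_read_unlock(struct toy_srcu *sp, int idx)
{
	assert(sp->readers[idx] > 0);
	sp->readers[idx]--;
}

/* Start a grace period: flip the index, return the one to wait on. */
static int toy_srcu_start_gp(struct toy_srcu *sp)
{
	int old = sp->active_idx;

	sp->active_idx = !old;
	return old;
}

/* The grace period is over once all pre-flip readers have left. */
static bool toy_srcu_gp_done(const struct toy_srcu *sp, int old)
{
	return sp->readers[old] == 0;
}
```

In this model, a reader stuck in some unrelated domain never delays the
tracing domain, and neither does a reader that enters the tracing domain
after the flip. A CPU wedged inside a tracing read-side section would
still hold up the tracing grace period, so this narrows the failure case
rather than removing it.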

Thoughts?

Thanks,

Mathieu

>
> ---
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 95181e36891a..b09c5b955555 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -4053,7 +4053,7 @@ EXPORT_SYMBOL_GPL(ring_buffer_read_prepare);
> void
> ring_buffer_read_prepare_sync(void)
> {
> - synchronize_sched();
> +// synchronize_sched();
> }
> EXPORT_SYMBOL_GPL(ring_buffer_read_prepare_sync);

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com