Re: [PATCH] tracing: Trace instrumentation begin and end

From: Peter Zijlstra
Date: Wed Mar 22 2023 - 08:53:25 EST


On Wed, Mar 22, 2023 at 08:48:34AM -0400, Steven Rostedt wrote:
> On Wed, 22 Mar 2023 12:19:14 +0100
> Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> > Steven!
> >
> > On Tue, Mar 21 2023 at 21:51, Steven Rostedt wrote:
> > > From: "Steven Rostedt (VMware)" <rostedt@xxxxxxxxxxx>
> > > produces:
> > >
> > >  2)   0.764 us    |  exit_to_user_mode_prepare();
> > >  2)               |  /* page_fault_user: address=0x7fadaba40fd8 ip=0x7fadaba40fd8 error_code=0x14 */
> > >  2)   0.581 us    |  down_read_trylock();
> > >
> > > The "page_fault_user" event is not encapsulated around any function, which
> > > means it probably triggered and went back to user space without any trace
> > > to know how long that page fault took (the down_read_trylock() is likely to
> > > be part of the page fault function, but that's beside the point).
> > >
> > > To help bring back the old functionality, two trace points are added. One
> > > just after instrumentation begins, and one just before it ends. This way,
> > > we can see all the time that the kernel can do something meaningful, and we
> > > will trace it.
> >
> > Seriously? That's completely insane. Have you actually looked at how many
> > instrumentation_begin()/end() pairs are in the affected code paths?
> >
> > Obviously not. It's a total of _five_ for every syscall and at least
> > _four_ for every interrupt/exception from user mode.
> >
> > The #1 design rule for instrumentation is to be as non-intrusive as
> > possible and not to be as lazy as possible.
>
> And it still is. It still uses nops when not enabled. I could even add a
> config to only have this compiled in when the config is set, so that
> production can disable it if it wants to.
>
> Just in case it's not obvious:
>
> 	if (tracepoint_enabled(instrumentation_begin))
> 		call_trace_instrumentation_begin(ip, pip);
>
> is equivalent to:
>
> 	if (static_key_false(&__tracepoint_instrumentation_begin.key))
> 		call_trace_instrumentation_begin(ip, pip);
>

It is still completely insane.

NAK.
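
For reference, the guarded-call pattern quoted above can be modelled in
plain userspace C. This is only a rough sketch under stated assumptions:
struct model_key, model_instrumentation_begin() and model_run() are
hypothetical names invented for illustration, and the boolean load used here
is not how the kernel does it (a real static key patches a nop/jump in the
instruction stream). It shows only the control flow: the out-of-line probe
is skipped at every site while the key is off, and called once per site
when it is on.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for a tracepoint's static key.
 * Assumption: in the kernel this check is a patched nop/jump,
 * not a memory load as modelled here.
 */
struct model_key {
	bool enabled;
};

static struct model_key instrumentation_begin_key; /* zero-initialized: disabled */
static unsigned long probe_calls;

/* Stand-in for the out-of-line probe call from the quoted snippet. */
static void call_trace_instrumentation_begin(unsigned long ip, unsigned long pip)
{
	(void)ip;
	(void)pip;
	probe_calls++;
}

/* Models the guard at a single instrumentation_begin() site. */
static void model_instrumentation_begin(unsigned long ip, unsigned long pip)
{
	if (instrumentation_begin_key.enabled)
		call_trace_instrumentation_begin(ip, pip);
}

/* Runs `sites` guarded sites and returns how many probe calls happened. */
static unsigned long model_run(unsigned int sites, bool enabled)
{
	unsigned int i;

	probe_calls = 0;
	instrumentation_begin_key.enabled = enabled;
	for (i = 0; i < sites; i++)
		model_instrumentation_begin(i, 0);
	return probe_calls;
}
```

With the key disabled, model_run(5, false) performs no probe calls; with it
enabled, model_run(5, true) performs five, one per site, which corresponds
to the five begin sites per syscall counted earlier in the thread.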