Re: [PATCH] tracing: Trace instrumentation begin and end

From: Thomas Gleixner
Date: Wed Mar 22 2023 - 07:19:27 EST


Steven!

On Tue, Mar 21 2023 at 21:51, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@xxxxxxxxxxx>
> produces:
>
> 2) 0.764 us | exit_to_user_mode_prepare();
> 2) | /* page_fault_user: address=0x7fadaba40fd8 ip=0x7fadaba40fd8 error_code=0x14 */
> 2) 0.581 us | down_read_trylock();
>
> The "page_fault_user" event is not encapsulated around any function, which
> means it probably triggered and went back to user space without any trace
> to know how long that page fault took (the down_read_trylock() is likely to
> be part of the page fault function, but that's beside the point).
>
> To help bring back the old functionality, two trace points are added. One
> just after instrumentation begins, and one just before it ends. This way,
> we can see all the time that the kernel can do something meaningful, and we
> will trace it.

Seriously? That's completely insane. Have you actually looked at how many
instrumentation_begin()/end() pairs are in the affected code paths?

Obviously not. It's a total of _five_ for every syscall and at least
_four_ for every interrupt/exception from user mode.

The #1 design rule for instrumentation is to be as non-intrusive as
possible and not to be as lazy as possible.

instrumentation_begin()/end() is solely meant for objtool validation and
nothing else.

There are clearly less horrible ways to retrieve the #PF duration, no?

Thanks,

tglx