Re: [PATCH] tracing: Trace instrumentation begin and end

From: Thomas Gleixner
Date: Wed Mar 22 2023 - 18:03:43 EST


Steven!

On Wed, Mar 22 2023 at 14:16, Steven Rostedt wrote:
> On Wed, 22 Mar 2023 16:39:41 +0100
> Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
>> But for figuring out how long a syscall, interrupt or exception takes
>> there are exactly TWO tracepoints required, not 10 or 8. And it's bloody
>> obvious where to place them, right?
>
> Not always, and it's pretty much always architecture dependent. I was
> looking for an architecture agnostic approach, as I imagine all archs will
> be eventually using this.

All architectures, at least those which matter, will eventually use the
generic entry code, which is completely architecture agnostic.

And again, while instrumentation_begin/end() might be in use on more
architectures today, it's still the fundamentally wrong place to stick a
tracepoint into. See below.

> Things could have changed since then. But if adding trace events for the
> missing syscalls and around exceptions for timing purposes is OK, then I'm
> happy to go that approach.

If there are tracepoints missing for syscalls then the obvious correct
thing is to add them so that the syscall coverage is complete instead of
abusing the lack of tracepoints as an argument for horrible hackery.

If there is a value to have tracepoints around exceptions etc. for
timing purposes then they definitely make more sense than the abuse of
instrumentation_begin/end() which just generates noise and adds overhead
into completely unrelated code pathes.

>> > And the more we add noinstr, the more the kernel becomes a black box
>> > again.
>>
>> It does not. noinstr is a technical requirement to keep instrumentation
>> out of code pathes which are not able to handle instrumentation. You
>> know that very well, so please stop this theatrical handwaving and come
>> back if you have sensible technical arguments.
>
> I never said nor implied that it's not important. I'm just concerned that
> we currently have no way to see when it's happening.

There is no value to see the instrumentation_begin()/end() annotations
in tracing, really.

They are annotations to enable objtool to validate that the noinstr
constraints are not violated. We need quite a few of them as the tooling
operates at function scope. So a function which is invoked from noinstr
low level code looks like this:

noinstr foo(...)
{
enter();

instrumentation_begin(); #2
do_other_stuff();
instrumentation_end(); #2

exit();
}

enter()/exit() need to be noinstr functions as well, but as enter()
might be the function which actually establishes the context,
e.g. __enter_from_user_mode(), it can contain calls into instrumentable
code too:

noinstr enter(..)
{
enter_context();

instrumentation_begin(); #1
do_instrumentable_enter();
instrumentation_end(); #1
}

See? So you end up with two pairs of instrumentation_begin()/end() in
this simple example. Add exit() to the picture:

noinstr exit(..)
{
instrumentation_begin(); #3
do_instrumentable_exit();
instrumentation_end(); #3

exit_context();
}

So you have already three.

In reality it's five pairs per syscall and at least four pairs per
exception/interrupt.

As you can see above end() #1 and begin() #2 are identical from a
tracing POV. So are end() #2 and begin() #3.

IOW there is zero value to have tracepoints in them. Especially as
do_instrumentable_enter()/exit() and do_other_stuff() are subject to
tracing already unless the compiler decides to inline them.

The only relevant information is begin() #1 and end() #3, right?

And as you cannot piggy-pack that on instrumentation_begin()/end() for
obvious reasons, you need explicit tracepoints, which are in the very
end more informative and come with a factor of 4-5 less overhead in both
the enabled and disabled case.

Contrary to popular believe, disabled tracepoints are not completely
free of overhead either.

Thanks,

tglx