Re: Efficient x86 and x86_64 NOP microbenchmarks

From: Steven Rostedt
Date: Fri Aug 15 2008 - 17:34:51 EST



[ Finally got my goodmis email back ]

On Wed, 13 Aug 2008, Andi Kleen wrote:

> > Sorry to ask, I feel I must be missing something, but I'm trying to
> > figure out where you propose to add the "call mcount" ? In the caller or
> > in the callee ?
>
> callee like gcc. caller would be likely more bloated because
> there are more calls than functions. Also if it was at the
> caller more code would be needed because the function currently
> executed couldn't be gotten from stack directly.
>
> > Or is it a different scheme I don't see ? I am trying to figure out how
> > you happen to do all that without dynamic code modification and manage
> > not to hurt performance.
>
> The dynamic code modification is only needed because there is no
> global table of the mcount call sites. So instead it discovers
> them at runtime, but that requires runtime-safe patching.

The new code does not discover the call sites at runtime; the old code
did that. The "to kill a daemon" patch set removed the runtime discovery
and replaced it with discovery at compile time.

>
> With a custom call scheme one could just build up a table of
> call sites at link time using an ELF section and then when
> tracing is enabled/disabled always patch them all in one go
> in a stop_machine(). Then you wouldn't need parallel execution safe
> patching anymore and it doesn't matter what the nops look like.

The current patch set does pretty much exactly this. Yes, I patch
everything at boot in one go, before the other CPUs are even active.
This takes all of 6 milliseconds, so it adds very little to boot time.
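
To make the "table of call sites, patched in one go" idea concrete, here
is a minimal userspace sketch. It is not the ftrace code: fake_text,
call_sites and patch_all_sites() are hypothetical names, the byte buffer
stands in for kernel text, and the fixed offset array stands in for the
table that the build would emit into an ELF section. It assumes a 5-byte
x86 "call rel32" at each site and a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SITE_LEN 5	/* length of an x86 "call rel32" instruction */

/* One common 5-byte NOP encoding; with a single CPU running, any works. */
static const uint8_t nop5[SITE_LEN] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Stand-in for kernel text (assumption: real code patches live text). */
uint8_t fake_text[64];

/* Stand-in for the call-site table the build would place in a section. */
static const size_t call_sites[] = { 0, 16, 40 };
#define NR_SITES (sizeof(call_sites) / sizeof(call_sites[0]))

/* Encode "call target" at offset 'site'; rel32 is relative to the next insn. */
static void make_call(uint8_t *insn, size_t site, size_t target)
{
	int32_t rel = (int32_t)((int64_t)target - (int64_t)(site + SITE_LEN));

	insn[0] = 0xe8;				/* call rel32 opcode */
	memcpy(insn + 1, &rel, sizeof(rel));
}

/*
 * Walk the whole table once and rewrite every site, either to a call to
 * the tracer at 'tracer_offset' or to a NOP. Returns the number of sites
 * patched. No other thread may execute the text while this runs.
 */
size_t patch_all_sites(int enable, size_t tracer_offset)
{
	size_t i;

	for (i = 0; i < NR_SITES; i++) {
		uint8_t *insn = fake_text + call_sites[i];

		assert(call_sites[i] + SITE_LEN <= sizeof(fake_text));
		if (enable)
			make_call(insn, call_sites[i], tracer_offset);
		else
			memcpy(insn, nop5, SITE_LEN);
	}
	return NR_SITES;
}
```

Calling patch_all_sites(0, 0) once turns every recorded site into a NOP,
and patch_all_sites(1, offset) turns them all back into calls. Because
the pass runs while nothing else executes the text, the particular NOP
encoding does not matter for correctness.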

>
> The other advantage is that it would allow getting rid of
> the frame pointer.

This is the only advantage that you have.

-- Steve
