Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls

From: H. Peter Anvin
Date: Thu Jul 19 2012 - 19:07:32 EST


On 07/19/2012 04:04 PM, Steven Rostedt wrote:
> On Thu, 2012-07-19 at 15:53 -0700, H. Peter Anvin wrote:
>
>> lea is not typically faster than add, but in the case of Atom, it is
>> done in an earlier pipeline stage (AGU instead of ALU) which means lea
>> is faster if its inputs are already available as address expressions and
>> is consumed by address expressions; the goal is to avoid the ALU->AGU
>> forwarding latency.
>
> Well, the question is, which is faster:
>
> lea 8(%esp), %esp
> addl $8, %esp
>
> Basically, all we want to do is add 8 to the stack pointer. And this is
> for the x86_32 version of whatever hardware is in use.
>

What I'm telling you is that it depends on the context.

An address expression needs to be ready in the AGU; a piece of data
comes from the ALU. Whenever something moves from the ALU to the AGU,
there is a penalty. There is no penalty to move from the AGU to the
ALU, since the ALU is in a later stage.
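
As a rough illustration of the rule above, here is a toy model in C. It
is only a sketch: the names (forwarding_penalty, UNIT_AGU, UNIT_ALU) are
made up for this example, and ALU_TO_AGU_PENALTY is a placeholder value,
not a measured Atom latency. It assumes lea produces its result in the
AGU, add produces its result in the ALU, and a memory operand's address
is consumed in the AGU.

```c
/* Toy model: the unit that produces a value and the unit that consumes it. */
enum unit { UNIT_AGU, UNIT_ALU };

/* Hypothetical placeholder, NOT a measured Atom latency. */
#define ALU_TO_AGU_PENALTY 1

/* Extra cycles paid when a result produced in `producer` feeds `consumer`. */
int forwarding_penalty(enum unit producer, enum unit consumer)
{
	/* Only the backwards move, ALU -> AGU, costs anything. */
	if (producer == UNIT_ALU && consumer == UNIT_AGU)
		return ALU_TO_AGU_PENALTY;
	return 0;
}

/*
 * The four contexts for "add 8 to %esp":
 *   lea 8(%esp), %esp  -> result produced in the AGU (assumed)
 *   addl $8, %esp      -> result produced in the ALU (assumed)
 * followed by either an address use, e.g. movl (%esp), %eax,
 * or a data use, e.g. addl %esp, %eax.
 */
int lea_then_address_use(void) { return forwarding_penalty(UNIT_AGU, UNIT_AGU); }
int add_then_address_use(void) { return forwarding_penalty(UNIT_ALU, UNIT_AGU); }
int lea_then_data_use(void)    { return forwarding_penalty(UNIT_AGU, UNIT_ALU); }
int add_then_data_use(void)    { return forwarding_penalty(UNIT_ALU, UNIT_ALU); }
```

In this model only add_then_address_use() returns a non-zero penalty,
which is the sense in which the answer depends on what consumes %esp
next.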

I *believe* the stack pointer adjustments made by push/pop are done in
the AGU, but I have to double-check.

-hpa
