Re: [PATCH 3/4] x86: open-code register save/restore in trace_hardirqs thunks

From: Andy Lutomirski
Date: Sat Jan 10 2015 - 16:02:45 EST


On Sat, Jan 10, 2015 at 12:42 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Sat, Jan 10, 2015 at 12:17:13PM -0800, Andy Lutomirski wrote:
>> I asked this once, and someone told me that push/pop has lower
>> throughput. I find this surprising.
>
> Implicit dependency on %rsp probably. The MOVs allow you to start more
> stuff out-of-order I'd guess...

AIUI modern CPUs have a return-stack predictor that matches call/ret
pairs and a stack engine that tracks rsp adjustments, so presumably
they can resolve rsp values across multiple pushes and pops very quickly.

Also, don't compilers generally use push and pop to save and restore
callee-saved registers? I think that function calls are common enough
that the CPU vendors would have made these sequences fast.
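
For concreteness, the two styles being compared look roughly like this.
This is only a sketch, not the actual thunk code: the three registers
and the some_function callee are made up for illustration.

```asm
	# Style 1: push/pop. Every instruction implicitly reads and
	# writes %rsp, which is the dependency Boris is worried about.
	pushq	%rdi
	pushq	%rsi
	pushq	%rdx
	call	some_function		# hypothetical callee
	popq	%rdx			# restore in reverse order
	popq	%rsi
	popq	%rdi

	# Style 2: adjust %rsp once, then use MOVs that depend only on
	# the adjusted %rsp and not on each other.
	subq	$24, %rsp
	movq	%rdi, 16(%rsp)		# same slot layout as the pushes above
	movq	%rsi, 8(%rsp)
	movq	%rdx, 0(%rsp)
	call	some_function
	movq	0(%rsp), %rdx
	movq	8(%rsp), %rsi
	movq	16(%rsp), %rdi
	addq	$24, %rsp
```

The push/pop version is shorter to encode. Whether it is actually
slower depends on how well the CPU hides the %rsp updates, which is
the open question here.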

--Andy

>
>> It could be worth adding a macro along the lines of pushq_cfi_save
>> that does the pushq_cfi and the CFI_REL_OFFSET.
>
> Yep, for balance.
>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --



--
Andy Lutomirski
AMA Capital Management, LLC