Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

From: Ingo Molnar
Date: Tue Jan 23 2018 - 05:23:43 EST



* David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:

> > On SkyLake this would add an overhead of maybe 2-3 cycles per function call and 
> > obviously all this code and data would be very cache hot. Given that the average 
> > number of function calls per system call is around a dozen, this would be _much_ 
> > faster than any microcode/MSR based approach.
>
> That's kind of neat, except you don't want it at the top of the
> function; you want it at the bottom.
>
> If you could hijack the *return* site, then you could check for
> underflow and stuff the RSB right there. But in __fentry__ there's not
> a lot you can do other than complain that something bad is going to
> happen in the future. You know that a string of 16+ rets is going to
> happen, but you've got no gadget in *there* to deal with it when it
> does.

No, it can be done with the existing CALL instrumentation callback that
CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack from
the CALL trampoline - see my previous email.
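
To make the idea concrete, here is a rough user-space model of it, not kernel code. It assumes a 16-entry RSB (going by the "16+ rets" above) and a shadow stack of saved return addresses. All names in it (rsb_model, call_hook, ret_hook, simulate_nested, RET_TRAMPOLINE) are made up for illustration and are not existing kernel symbols. The CALL-side hook saves the real return address and redirects the return to a trampoline; the RET-side hook then checks for underflow and "stuffs" the RSB right at the return site:

```c
#include <assert.h>
#include <stddef.h>

#define RSB_ENTRIES    16            /* assumed RSB depth ("16+ rets") */
#define SHADOW_MAX     64            /* arbitrary limit for this model */
#define RET_TRAMPOLINE 0xdeadbeefUL  /* stand-in address of the return hook */

/* Hypothetical per-CPU state; none of these names exist in the kernel. */
struct rsb_model {
	int rsb_valid;                    /* modelled number of usable RSB entries */
	int stuff_count;                  /* how many times the RSB was refilled */
	size_t shadow_top;                /* number of saved return addresses */
	unsigned long shadow[SHADOW_MAX]; /* original return addresses */
};

/* CALL-side hook: save the real return address, redirect the return. */
static void call_hook(struct rsb_model *m, unsigned long *ret_slot)
{
	assert(m->shadow_top < SHADOW_MAX);
	m->shadow[m->shadow_top++] = *ret_slot;
	*ret_slot = RET_TRAMPOLINE;
	if (m->rsb_valid < RSB_ENTRIES)
		m->rsb_valid++;             /* the CALL pushed one RSB entry */
}

/* RET-side hook: reached via the hijacked return address. */
static unsigned long ret_hook(struct rsb_model *m)
{
	assert(m->shadow_top > 0);
	if (m->rsb_valid > 0)
		m->rsb_valid--;             /* the RET consumed one RSB entry */
	if (m->rsb_valid == 0) {            /* the next RET would underflow */
		m->rsb_valid = RSB_ENTRIES; /* model of stuffing the RSB */
		m->stuff_count++;
	}
	return m->shadow[--m->shadow_top];  /* where to really return to */
}

/* Drive `depth` nested calls, then unwind them in reverse order.
 * Returns 1 if every original return address was restored. */
static int simulate_nested(struct rsb_model *m, int depth)
{
	unsigned long slots[SHADOW_MAX];
	int i;

	if (depth < 0 || depth > SHADOW_MAX)
		return 0;
	for (i = 0; i < depth; i++) {
		slots[i] = 0x1000UL + (unsigned long)i;
		call_hook(m, &slots[i]);
		if (slots[i] != RET_TRAMPOLINE)
			return 0;
	}
	for (i = depth - 1; i >= 0; i--)
		if (ret_hook(m) != 0x1000UL + (unsigned long)i)
			return 0;
	return 1;
}
```

In this model, starting from a zeroed struct rsb_model, simulate_nested(&m, 20) stuffs exactly once, at the 16th return, and leaves the shadow stack empty. A real implementation would have to do this in assembly in the trampolines and deal with preemption, interrupts and the like, none of which is modelled here.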

> HJ did have patches to turn 'ret' into a form of retpoline, which I
> don't think ever even got performance-tested.

Return instrumentation is possible as well, but there are two major drawbacks:

- GCC support for it is not as widely available, and return instrumentation is
less tested in Linux kernel contexts.

- A major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already
enabled in distros today, so the runtime overhead on non-SkyLake CPUs
would be literally zero, while still allowing us to fix the RSB vulnerability on
SkyLake.

Thanks,

Ingo