Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

From: Ingo Molnar
Date: Tue Jan 23 2018 - 05:44:40 EST



* David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:

> On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote:
> >
> > BTW., the reason this is enabled on all distro kernels is because the overhead
> > is a single patched-in NOP instruction in the function prologue, when tracing
> > is disabled. So it's not even a CALL+RET - it's a patched-in NOP.
>
> Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to get
> to the point where we can patch *that* out into a NOP... or are you saying we
> already can?

Yes, we already can and do patch the 'call __fentry__/mcount' call sites into
NOPs today - all 50,000+ call sites on a typical distro kernel.

We have done so for a long time - this is all a well established, working mechanism.
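
To illustrate the idea only, here is a minimal user-space sketch, not the
kernel's actual ftrace code. It assumes x86-64, where 'call rel32' is the
opcode 0xE8 followed by a 4-byte displacement, and replaces it with the
standard 5-byte NOP encoding. The function names and the plain byte buffer
standing in for kernel text are hypothetical; the real mechanism also has to
deal with write-protected text and other CPUs executing the code being patched.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CALL_OPCODE    0xE8 /* x86 'call rel32' opcode */
#define CALL_INSN_SIZE 5    /* 1 opcode byte + 4 displacement bytes */

/* Standard 5-byte x86 NOP: nopl 0x0(%rax,%rax,1) */
static const uint8_t nop5[CALL_INSN_SIZE] = { 0x0F, 0x1F, 0x44, 0x00, 0x00 };

/*
 * Hypothetical helper: overwrite one 5-byte call instruction with a NOP.
 * Returns 0 on success, -1 if the site does not start with a call opcode.
 */
static int patch_call_to_nop(uint8_t *site)
{
	if (site[0] != CALL_OPCODE)
		return -1;
	memcpy(site, nop5, CALL_INSN_SIZE);
	return 0;
}

/*
 * Hypothetical helper: walk a table of call-site offsets (recorded at build
 * time in the real kernel) and patch each one. Returns the number of sites
 * that were actually patched.
 */
static size_t patch_all_sites(uint8_t *text, const size_t *offsets, size_t n)
{
	size_t patched = 0;

	for (size_t i = 0; i < n; i++)
		if (patch_call_to_nop(text + offsets[i]) == 0)
			patched++;
	return patched;
}
```

The NOP has the same length as the call, so no surrounding code moves and the
site can later be patched back into a call when tracing is enabled.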

> But this is a digression. I was being pedantic about the "0 cycles" but sure,
> this would be perfectly tolerable.

It's not a digression in two ways:

- I wanted to make it clear that for distro kernels it _is_ a zero cycles overhead
mechanism for non-SkyLake CPUs, literally.

- I noticed that Meltdown and the CR3 writes for PTI appear to have established a
kind of ... insensitivity and numbness to kernel micro-costs, which peaked with
the per-syscall MSR write nonsense patch of the SkyLake workaround.
That attitude is totally unacceptable to me as x86 maintainer and yes, every
cycle still counts.

Thanks,

Ingo