Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

From: Andy Lutomirski
Date: Thu Nov 29 2018 - 17:25:51 EST


On Thu, Nov 29, 2018 at 2:22 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Nov 29, 2018 at 04:14:46PM -0600, Josh Poimboeuf wrote:
> > On Thu, Nov 29, 2018 at 11:01:48PM +0100, Peter Zijlstra wrote:
> > > On Thu, Nov 29, 2018 at 11:10:50AM -0600, Josh Poimboeuf wrote:
> > > > On Thu, Nov 29, 2018 at 08:59:31AM -0800, Andy Lutomirski wrote:
> > >
> > > > > (like pointing IP at a stub that retpolines to the target by reading
> > > > > the function pointer, a la the unoptimizable version), then okay, I
> > > > > guess, with only a small amount of grumbling.
> > > >
> > > > I tried that in v2, but Peter pointed out it's racy:
> > > >
> > > > https://lkml.kernel.org/r/20181126160217.GR2113@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> > >
> > > Ah, but that is because it is a global shared trampoline.
> > >
> > > Each static_call has its own trampoline, which currently reads
> > > something like:
> > >
> > > RETPOLINE_SAFE
> > > JMP *key
> > >
> > > which you then 'defuse' by writing a UD2 over it. _However_, if you write
> > > that trampoline like:
> > >
> > > 1: RETPOLINE_SAFE
> > > JMP *key
> > > 2: CALL_NOSPEC *key
> > > RET
> > >
> > > and have the text_poke_bp() handler jump to 2 (a location you'll never
> > > reach when you enter at 1), it will in fact work I think. The trampoline
> > > is never modified and not shared between different static_calls.
> >
> > But after returning from the function to the trampoline, how does it
> > return from the trampoline to the call site? At that point there is no
> > return address on the stack.
>
> Oh, right, so that RET don't work. ARGH. Time to go sleep I suppose.

I assume I'm missing something, but can't it just be JMP_NOSPEC *key?
The code would call the trampoline just like any other function and,
if the alignment is bad, we can skip patching it. And, if we want the
performance back, maybe some day we can find a clean way to patch
those misaligned callers, too.
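
A minimal sketch of the "skip patching if the alignment is bad" idea, for
illustration only. It assumes that "bad alignment" means the 4-byte rel32
operand of a 5-byte direct CALL straddles a cache-line boundary, so it
cannot be rewritten with a single atomic store, and it assumes 64-byte
cache lines. The helper names (call_operand_is_patchable,
count_patchable_sites) are hypothetical and do not come from the patch
series; call sites that fail the check would simply keep calling the
trampoline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE_SIZE     64u
/* A direct near CALL is opcode 0xE8 followed by a 4-byte rel32 operand. */
#define CALL_OPERAND_OFFSET 1u
#define CALL_OPERAND_SIZE   4u

/*
 * Hypothetical helper: returns true when the rel32 operand of a CALL
 * located at insn_addr lies entirely within one cache line.
 */
static bool call_operand_is_patchable(uintptr_t insn_addr)
{
	uintptr_t first = insn_addr + CALL_OPERAND_OFFSET;
	uintptr_t last = first + CALL_OPERAND_SIZE - 1;

	return (first / CACHE_LINE_SIZE) == (last / CACHE_LINE_SIZE);
}

/*
 * Hypothetical helper: counts the call sites that would be patched.
 * Every other site is left alone and keeps going through the trampoline.
 */
static unsigned int count_patchable_sites(const uintptr_t *sites,
					  unsigned int n)
{
	unsigned int i, count = 0;

	assert(sites != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (call_operand_is_patchable(sites[i]))
			count++;
	}

	return count;
}
```

With these assumptions, a CALL at offset 59 in a line is patchable (its
operand occupies bytes 60..63), while one at offset 60 is not (bytes
61..64 cross into the next line).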