Re: [PATCH 3/6] x86/kvm/emulate: Avoid RET for fastops
From: Peter Zijlstra
Date: Wed Apr 16 2025 - 04:39:31 EST

On Tue, Apr 15, 2025 at 07:39:41AM -0700, Josh Poimboeuf wrote:
> On Tue, Apr 15, 2025 at 09:44:21AM +0200, Peter Zijlstra wrote:
> > On Mon, Apr 14, 2025 at 03:36:50PM -0700, Josh Poimboeuf wrote:
> > > On Mon, Apr 14, 2025 at 01:11:43PM +0200, Peter Zijlstra wrote:
> > > > Since there is only a single fastop() function, convert the FASTOP
> > > > stuff from CALL_NOSPEC+RET to JMP_NOSPEC+JMP, avoiding the return
> > > > thunks and all that jazz.
> > > >
> > > > Specifically FASTOPs rely on the return thunk to preserve EFLAGS,
> > > > which not all of them can trivially do (call depth tracing suffers
> > > > here).
> > > >
> > > > Objtool strenuously complains about things, therefore fix up the
> > > > various problems:
> > > >
> > > > - indirect call without a .rodata, fails to determine JUMP_TABLE,
> > > > add an annotation for this.
> > > > - fastop functions fall through, create an exception for this case
> > > > - unreachable instruction after fastop_return, save/restore
> > >
> > > I think this breaks unwinding. Each of the individual fastops inherits
> > > fastop()'s stack but the ORC doesn't reflect that.
> >
> > I'm not sure I understand. There is only the one location, and we
> > simply save/restore the state around the one 'call'.
>
> The problem isn't fastop() but rather the tiny functions it "calls".
> Each of those is marked STT_FUNC so it gets its own ORC data saying the
> return address is at RSP+8.
>
> Changing from CALL_NOSPEC+RET to JMP_NOSPEC+JMP means the return address
> isn't pushed before the branch. Thus they become part of fastop()
> rather than separate functions. RSP+8 is only correct if it happens to
> have not pushed anything to the stack before the indirect JMP.

Yeah, I finally got there. I'll go cook up something else.
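
For illustration only, the unwinding problem described above can be sketched as a toy userspace model. This is not kernel code and not what objtool or the ORC unwinder actually runs. It assumes a simplified function-entry unwind rule (CFA = RSP + 8, with the return address in the word just below the CFA, i.e. at the current top of stack), and every name in it (toy_stack, toy_unwind_ret_addr, RET_ADDR, SAVED_REG, and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16
#define RET_ADDR    0x1000u  /* hypothetical return address */
#define SAVED_REG   0xdeadu  /* hypothetical value pushed by the caller */

/* Toy downward-growing stack; rsp is an index into words[], one slot = 8 bytes. */
struct toy_stack {
	uint64_t words[STACK_WORDS];
	size_t rsp;	/* index of the current top-of-stack word */
};

void toy_init(struct toy_stack *s)
{
	s->rsp = STACK_WORDS;	/* empty stack */
}

void toy_push(struct toy_stack *s, uint64_t v)
{
	assert(s->rsp > 0);
	s->words[--s->rsp] = v;
}

/*
 * Assumed function-entry rule: CFA = RSP + 8 (one slot here), and the
 * return address lives in the slot just below the CFA, which is the
 * word RSP currently points at.
 */
uint64_t toy_unwind_ret_addr(const struct toy_stack *s)
{
	size_t cfa;

	assert(s->rsp < STACK_WORDS);
	cfa = s->rsp + 1;
	return s->words[cfa - 1];
}

/* CALL case: the caller pushes a register, then CALL pushes the return address. */
uint64_t toy_enter_via_call(void)
{
	struct toy_stack s;

	toy_init(&s);
	toy_push(&s, SAVED_REG);
	toy_push(&s, RET_ADDR);		/* pushed by CALL */
	return toy_unwind_ret_addr(&s);	/* rule finds RET_ADDR */
}

/* JMP case: the only return address is the caller's own, buried under its push. */
uint64_t toy_enter_via_jmp(void)
{
	struct toy_stack s;

	toy_init(&s);
	toy_push(&s, RET_ADDR);		/* caller's own return address */
	toy_push(&s, SAVED_REG);	/* caller pushes before the indirect JMP */
	/* JMP pushes nothing, so the entry rule now reads the wrong slot. */
	return toy_unwind_ret_addr(&s);	/* rule finds SAVED_REG */
}
```

In this model toy_enter_via_call() recovers RET_ADDR, while toy_enter_via_jmp() hands back SAVED_REG: the per-function entry rule only holds in the JMP case if the caller happens to have pushed nothing before the branch.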