Re: [RFC PATCH 01/11] x86: kernel FineIBT

From: Peter Zijlstra
Date: Wed May 04 2022 - 14:32:11 EST


On Wed, May 04, 2022 at 10:04:02AM -0700, Peter Collingbourne wrote:
> On Wed, May 04, 2022 at 12:20:19PM +0200, Peter Zijlstra wrote:
> > On Tue, May 03, 2022 at 03:02:44PM -0700, Josh Poimboeuf wrote:
> >
> > > I'm not really qualified to comment on this too directly since I haven't
> > > looked very much at the variations on FineIBT/CFI/KCFI, and what the
> > > protections and drawbacks are for each approach, and when it might even
> > > make sense to combine them for a "paranoid user".
> > >
> > > Since we have multiple similar and possibly competing technologies being
> > > discussed, one thing I do want to warn against is that we as kernel
> > > developers tend to err on the side of giving people too many choices and
> > > combinations which *never* get used.
> >
> > So I don't think there's going to be a user choice here. If there's
> > hardware support, FineIBT makes more sense. That also means that kCFI no
> > longer needs to worry about IBT.
> >
> > If we do something like:
> >
> >
> >                 kCFI                                                    FineIBT
> >
> > __cfi_\sym:                                             __cfi_\sym:
> >         endbr                           # 4                     endbr                   # 4
> >         sub $hash, %r10                 # 7                     sub $hash, %r10         # 7
> >         je \sym                         # 2                     je \sym                 # 2
> >         ud2                             # 2                     ud2                     # 2
> > \sym:                                                   \sym:
> >
> >
> > caller:                                                 caller:
> >         cmpl $hash, -8(%r11)            # 8                     movl $hash, %r10d       # 6
> >         je 1f                           # 2                     sub 15, %r11            # 4
> >         ud2                             # 2                     call *%r11              # 3
> > 1:      call __x86_indirect_thunk_r11   # 5                     .nop 4                  # 4 (could even fix up r11 again)
> >
> >
> > Then, all that's required is a slight tweak to apply_retpolines() to
> > rewrite a little more text.
> >
> > Note that this also does away with having to fix up the linker, since
> > all direct call will already point at \sym. It's just the IBT indirect
> > calls that need to frob the pointer in order to hit the ENDBR.
> >
> > On top of that, we no longer have to special case the objtool
> > instruction decoder, the prelude are proper instructions now.
>
> For kCFI this brings back the gadget problem that I mentioned here:
> https://lore.kernel.org/all/Yh7fLRYl8KgMcOe5@xxxxxxxxxx/
>
> because the hash at the call site is 8 bytes before the call
> instruction.

Damn, I forgot about that. Too subtle :-/

So Joao had another crazy idea, lemme diagram that to see if it works.

(sorry, I inverted the column order by accident)


                FineIBT                                         kCFI

__fineibt_\hash:
        xor     \hash, %r10     # 7
        jz      1f              # 2
        ud2                     # 2
1:      ret                     # 1
        int3                    # 1


__cfi_\sym:                                     __cfi_\sym:
                                                        int3; int3                        # 2
        endbr                   # 4                     mov     \hash, %eax               # 5
        call    __fineibt_\hash # 5                     int3; int3                        # 2
\sym:                                           \sym:
        ...                                             ...


caller:                                         caller:
        movl    \hash, %r10d    # 6                     cmpl    \hash, -6(%r11)           # 8
        sub     $9, %r11        # 4                     je      1f                        # 2
        call    *%r11           # 3                     ud2                               # 2
        .nop 4                  # 4 (or fixup r11)      call    __x86_indirect_thunk_r11  # 5


This way we also need to patch the __cfi_\sym contents, but we get a
little extra room to place the constant for kCFI in a suitable location.

It seems to preserve the properties of the last one in that direct calls
will already be correct and we don't need linker fixups, and objtool can
simply parse the preamble as regular instructions without needing
further help.
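
For anyone wanting to double-check the byte arithmetic in the second
diagram, here is a rough sketch in Python. It only adds up the "# N"
instruction lengths annotated above; the helper name and dictionary keys
are made up for illustration. It also assumes the immediate is the last
4 bytes of both the cmpl and the mov, and that the gadget concern is
about an indirect call landing on the call-site thunk call (my reading
of the objection quoted above).

```python
# Instruction lengths (bytes) are taken from the "# N" annotations in the
# diagram; all names below are hypothetical and only serve the arithmetic.

def layout(insns):
    """Return ({name: start offset}, total length) for [(name, length), ...]."""
    pos, starts = 0, {}
    for name, length in insns:
        starts[name] = pos
        pos += length
    return starts, pos

# FineIBT preamble: endbr (4) + call __fineibt_\hash (5), then \sym.
fineibt, fineibt_len = layout([("endbr", 4), ("call", 5)])
# The caller does "sub $9, %r11", so the indirect call lands at \sym - 9.
fineibt_entry = fineibt_len - 9

# kCFI preamble: int3;int3 (2) + mov \hash,%eax (5) + int3;int3 (2), then \sym.
kcfi, kcfi_len = layout([("pad_lo", 2), ("mov", 5), ("pad_hi", 2)])
# Assumption: the mov is 1 opcode byte followed by the 4-byte immediate.
hash_off_from_sym = (kcfi["mov"] + 1) - kcfi_len

# kCFI call site: cmpl (8), je (2), ud2 (2), call thunk (5).
site, _ = layout([("cmpl", 8), ("je", 2), ("ud2", 2), ("call", 5)])
# Assumption: the immediate occupies the last 4 bytes of the cmpl.
site_imm = site["cmpl"] + 8 - 4

# Target address X for which the check reads the call site's own immediate:
target_with_minus_6 = site_imm + 6   # layout from this mail
target_with_minus_8 = site_imm + 8   # layout from the previous mail

print("FineIBT entry offset in preamble:", fineibt_entry)
print("kCFI hash offset relative to \\sym:", hash_off_from_sym)
print("-6 layout self-match lands on ud2:", target_with_minus_6 == site["ud2"])
print("-8 layout self-match lands on call:", target_with_minus_8 == site["call"])
```

If those assumptions hold, the -6 placement means the only address at
the call site that passes the hash check is the ud2 itself, whereas the
-8 placement made the thunk call a valid target.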