Re: [PATCH] x86,retpoline: Be sure to emit INT3 after JMP *%\reg

From: Alexei Starovoitov
Date: Fri Sep 09 2022 - 10:20:11 EST


On Fri, Sep 9, 2022 at 1:16 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Sep 08, 2022 at 07:01:12AM -0700, Alexei Starovoitov wrote:
>
> > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > index c1f6c1c51d99..37f821dee68f 100644
> > > --- a/arch/x86/net/bpf_jit_comp.c
> > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > @@ -419,7 +419,8 @@ static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip)
> > >  		OPTIMIZER_HIDE_VAR(reg);
> > >  		emit_jump(&prog, &__x86_indirect_thunk_array[reg], ip);
> > >  	} else {
> > > -		EMIT2(0xFF, 0xE0 + reg);
> > > +		EMIT2(0xFF, 0xE0 + reg);	/* jmp *%\reg */
> > > +		EMIT1(0xCC);			/* int3 */
> >
> > Hmm. Why is this unconditional?
> > Shouldn't it be guarded with CONFIG_xx or cpu_feature_enabled()?
> > People who don't care about hw speculation vulnerabilities
> > shouldn't pay the price of increased code size.
>
> Sure, like so then?
>
> ---
> Subject: x86,retpoline: Be sure to emit INT3 after JMP *%\reg
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Date: Thu, 8 Sep 2022 12:04:50 +0200
>
> Both AMD and Intel recommend using INT3 after an indirect JMP. Make sure
> to emit one when rewriting the retpoline JMP irrespective of compiler
> SLS options or even CONFIG_SLS.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
>
>  arch/x86/kernel/alternative.c |    9 +++++++++
>  arch/x86/net/bpf_jit_comp.c   |    4 +++-
>  2 files changed, 12 insertions(+), 1 deletion(-)
>
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -453,6 +453,15 @@ static int patch_retpoline(void *addr, s
>  		return ret;
>  	i += ret;
> 
> +	/*
> +	 * The compiler is supposed to EMIT an INT3 after every unconditional
> +	 * JMP instruction due to AMD BTC. However, if the compiler is too old
> +	 * or SLS isn't enabled, we still need an INT3 after indirect JMPs
> +	 * even on Intel.
> +	 */
> +	if (op == JMP32_INSN_OPCODE && i < insn->length)
> +		bytes[i++] = INT3_INSN_OPCODE;
> +
>  	for (; i < insn->length;)
>  		bytes[i++] = BYTES_NOP1;
>
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -419,7 +419,9 @@ static void emit_indirect_jump(u8 **ppro
>  		OPTIMIZER_HIDE_VAR(reg);
>  		emit_jump(&prog, &__x86_indirect_thunk_array[reg], ip);
>  	} else {
> -		EMIT2(0xFF, 0xE0 + reg);
> +		EMIT2(0xFF, 0xE0 + reg);	/* jmp *%\reg */
> +		if (IS_ENABLED(CONFIG_RETPOLINE) || IS_ENABLED(CONFIG_SLS))
> +			EMIT1(0xCC);		/* int3 */

Looks better. Ack.
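
For anyone following along, here is a rough standalone sketch (my own
illustration, not part of the patch) of the byte patterns being discussed.
It assumes a low register such as RAX, so no REX prefix is involved;
r8-r15 would need one in front of the FF byte. The helper name is made up
for the example; it is not the kernel's emitter.

#include <stdio.h>

/* Illustration only: build "jmp *%reg" optionally followed by "int3". */
static int emit_indirect_jmp(unsigned char *buf, int reg, int want_int3)
{
	int i = 0;

	buf[i++] = 0xFF;		/* opcode: JMP r/m64 */
	buf[i++] = 0xE0 + reg;		/* ModRM: jmp *%reg, 0xE0 == %rax */
	if (want_int3)
		buf[i++] = 0xCC;	/* int3 stops straight-line speculation */
	return i;
}

int main(void)
{
	unsigned char buf[8];
	int i, n = emit_indirect_jmp(buf, 0, 1);	/* jmp *%rax; int3 */

	for (i = 0; i < n; i++)
		printf("%02x ", buf[i]);
	printf("\n");
	return 0;
}

Compiled and run, this prints "ff e0 cc", i.e. the two-byte indirect jump
plus the one extra trap byte the patch makes conditional on
CONFIG_RETPOLINE || CONFIG_SLS.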