Re: [PATCH -v5 00/17] Rewrite x86/ftrace to use text_poke (and more)

From: Alexei Starovoitov
Date: Mon Nov 11 2019 - 14:47:38 EST


On Mon, Nov 11, 2019 at 02:12:52PM +0100, Peter Zijlstra wrote:
> Ftrace is one of the last W^X violators (after this only KLP is left). These
> patches move it over to the generic text_poke() interface and thereby get rid
> of this oddity.
>
> The first 14 patches are the same as in the -v4 posting. The last 3 patches are
> new.
>
> Will, patch 13, arm/ftrace, is unchanged. This is because this way it preserves
> behaviour, but if you can provide me a tested-by for the simpler variant I can
> drop that in.
>
> Patch 15 reworks ftrace's event_create_dir(), which ran module code before the
> module was finished loading (before we even applied jump_labels and all that).
>
> Patch 16 and 17 address minor review feedback.
>
> Ingo, Alexei wants patch #1 for some BPF stuff, can he get that in a topic branch?

Thanks Peter!
Much appreciated.

I've re-tested patch 1 alone (it seems to be exactly the same as the one you
posted originally back on Aug 27 and then again on Oct 7). I have also tested my
stuff with this whole set. No conflicts. Feel free to add to patch 1 alone or
to the whole set:

Acked-by: Alexei Starovoitov <ast@xxxxxxxxxx>
Tested-by: Alexei Starovoitov <ast@xxxxxxxxxx>

Some of the patches are split too fine, I think. I would have combined them, but
we try hard to limit our sets to fewer than fifteen patches in bpf/netdev land,
fwiw.

It was poor judgment on my part to use text_poke() in my patch (to avoid an
explicit dependency on your patch) without mentioning the obvious race in the
commit log, along with the intended fix once the trees converge:

 	case BPF_MOD_CALL_TO_CALL:
 		if (memcmp(ip, old_insn, X86_CALL_SIZE))
 			goto out;
-		text_poke(ip, new_insn, X86_CALL_SIZE);
+		text_poke_bp(ip, new_insn, X86_CALL_SIZE, NULL);
 		break;

To avoid the issue in the first place, the best option is to have your first
patch in both the tip and bpf-next/net-next trees. We have had "the same patch
in multiple trees" situations in the past, and git did the right thing during
the merge window. So I don't anticipate any issues this time around.

One more question.
What is the reason you stick to int3-style poking when an 8-byte write is
atomic? Can text_poke() patch a nop5 by combining the call/jmp5 insn with the
extra 3 bytes after the nop and writing all 8 bytes at once?
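
To make the question concrete, here is a rough userspace sketch of the
byte-combining step I have in mind. It operates on a plain buffer, not on live
kernel text, and the helper names (make_poke_word, poke_word_is_aligned) are
made up for illustration; they are not existing kernel functions. It also
assumes the 8 bytes are naturally aligned, and it says nothing about whether
such a store is safe against concurrent instruction fetch, which is exactly
what I am asking about.

```c
#include <stdint.h>
#include <string.h>

#define CALL_INSN_SIZE 5	/* 1 opcode byte + rel32, same size as nop5 */
#define POKE_WORD_SIZE 8

/*
 * Hypothetical helper: build the 8-byte image that would replace the
 * 5-byte insn at 'site' plus the 3 bytes that follow it. The trailing
 * 3 bytes are copied unchanged from the current text.
 */
static uint64_t make_poke_word(const uint8_t *site, const uint8_t *new_insn)
{
	uint8_t buf[POKE_WORD_SIZE];
	uint64_t word;

	memcpy(buf, new_insn, CALL_INSN_SIZE);
	memcpy(buf + CALL_INSN_SIZE, site + CALL_INSN_SIZE,
	       POKE_WORD_SIZE - CALL_INSN_SIZE);
	memcpy(&word, buf, sizeof(word));
	return word;
}

/* Assumption: a single 8-byte store is only considered when aligned. */
static int poke_word_is_aligned(const void *site)
{
	return ((uintptr_t)site % POKE_WORD_SIZE) == 0;
}
```

With a nop5 (0f 1f 44 00 00) followed by three other bytes at the site, the
resulting word carries the new call in its first 5 bytes and the original 3
bytes after it, so one aligned 8-byte store would replace the nop without
altering its neighbours.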