Re: [PATCH] riscv: kprobe: Optimize kprobe with accurate atomicity

From: Guo Ren
Date: Fri Jan 27 2023 - 23:46:07 EST


On Sat, Jan 28, 2023 at 11:53 AM liaochang (A) <liaochang1@xxxxxxxxxx> wrote:
>
> Hi, Guo Ren
>
> 在 2023/1/27 0:15, guoren@xxxxxxxxxx 写道:
> > From: Guo Ren <guoren@xxxxxxxxxxxxxxxxx>
> >
> > The previous implementation was based on the stop_matchine mechanism,
> > which reduced the speed of arm/disarm_kprobe. Using minimum ebreak
> > instruction would get accurate atomicity.
> >
> > This patch removes the patch_text of riscv, which is based on
> > stop_machine. Then riscv only reserved patch_text_nosync, and developers
> > need to be more careful in dealing with patch_text atomicity.
>
> In the serie of RISCV OPTPROBES [1], it patches a long-jump instructions pair
> AUIPC/JALR in kernel text, so in order to ensure other CPUs does not execute
> in the instructions that will be modified, it is still need to stop other CPUs
> via patch_text API, or you have any better solution to achieve the purpose?
- The stop_machine is an expensive way all architectures should
avoid, and you could keep that in your OPTPROBES implementation files
with static functions.
- The stop_machine couldn't work with PREEMPTION, so your
implementation needs to work with !PREEMPTION.

>
> Thanks.
>
> >
> > When CONFIG_RISCV_ISA_C=n, the ebreak could replace the whole
> > instruction. When CONFIG_RISCV_ISA_C=y, the patch uses 16-bit length
> > c.ebreak instruction, which may occupy the first part of the 32-bit
> > instruction and leave half the rest of the broken instruction. Because
> > ebreak could detour the flow to skip it, leaving it in the kernel text
> > memory is okay.
> >
> > Signed-off-by: Guo Ren <guoren@xxxxxxxxxxxxxxxxx>
> > Signed-off-by: Guo Ren <guoren@xxxxxxxxxx>
> > ---
> > arch/riscv/include/asm/patch.h | 1 -
> > arch/riscv/kernel/patch.c | 33 ------------------------------
> > arch/riscv/kernel/probes/kprobes.c | 29 ++++++++++++++++++--------
> > 3 files changed, 21 insertions(+), 42 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/patch.h b/arch/riscv/include/asm/patch.h
> > index 9a7d7346001e..2500782e6f5b 100644
> > --- a/arch/riscv/include/asm/patch.h
> > +++ b/arch/riscv/include/asm/patch.h
> > @@ -7,6 +7,5 @@
> > #define _ASM_RISCV_PATCH_H
> >
> > int patch_text_nosync(void *addr, const void *insns, size_t len);
> > -int patch_text(void *addr, u32 insn);
> >
> > #endif /* _ASM_RISCV_PATCH_H */
> > diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
> > index 765004b60513..8bd51ed8b806 100644
> > --- a/arch/riscv/kernel/patch.c
> > +++ b/arch/riscv/kernel/patch.c
> > @@ -98,36 +98,3 @@ int patch_text_nosync(void *addr, const void *insns, size_t len)
> > return ret;
> > }
> > NOKPROBE_SYMBOL(patch_text_nosync);
> > -
> > -static int patch_text_cb(void *data)
> > -{
> > - struct patch_insn *patch = data;
> > - int ret = 0;
> > -
> > - if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
> > - ret =
> > - patch_text_nosync(patch->addr, &patch->insn,
> > - GET_INSN_LENGTH(patch->insn));
> > - atomic_inc(&patch->cpu_count);
> > - } else {
> > - while (atomic_read(&patch->cpu_count) <= num_online_cpus())
> > - cpu_relax();
> > - smp_mb();
> > - }
> > -
> > - return ret;
> > -}
> > -NOKPROBE_SYMBOL(patch_text_cb);
> > -
> > -int patch_text(void *addr, u32 insn)
> > -{
> > - struct patch_insn patch = {
> > - .addr = addr,
> > - .insn = insn,
> > - .cpu_count = ATOMIC_INIT(0),
> > - };
> > -
> > - return stop_machine_cpuslocked(patch_text_cb,
> > - &patch, cpu_online_mask);
> > -}
> > -NOKPROBE_SYMBOL(patch_text);
> > diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> > index 475989f06d6d..27f8960c321c 100644
> > --- a/arch/riscv/kernel/probes/kprobes.c
> > +++ b/arch/riscv/kernel/probes/kprobes.c
> > @@ -24,12 +24,18 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
> > static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
> > {
> > unsigned long offset = GET_INSN_LENGTH(p->opcode);
> > +#ifdef CONFIG_RISCV_ISA_C
> > + u32 opcode = __BUG_INSN_16;
> > +#else
> > + u32 opcode = __BUG_INSN_32;
> > +#endif
> >
> > p->ainsn.api.restore = (unsigned long)p->addr + offset;
> >
> > - patch_text(p->ainsn.api.insn, p->opcode);
> > - patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
> > - __BUG_INSN_32);
> > + patch_text_nosync(p->ainsn.api.insn, &p->opcode, offset);
> > + patch_text_nosync((void *)((unsigned long)(p->ainsn.api.insn) + offset),
> > + &opcode, GET_INSN_LENGTH(opcode));
> > +
>
> I have submit a similar optimization for patching single-step slot [2].
> And it is indeed safe to use compact breakpoint in single-step slot no matter
> what type of patched instruction is.
Keep the same c.ebreak style in CONFIG_RISCV_ISA_C. It's my design principle.

>
> Thanks.
>
> > }
> >
> > static void __kprobes arch_prepare_simulate(struct kprobe *p)
> > @@ -114,16 +120,23 @@ void *alloc_insn_page(void)
> > /* install breakpoint in text */
> > void __kprobes arch_arm_kprobe(struct kprobe *p)
> > {
> > - if ((p->opcode & __INSN_LENGTH_MASK) == __INSN_LENGTH_32)
> > - patch_text(p->addr, __BUG_INSN_32);
> > - else
> > - patch_text(p->addr, __BUG_INSN_16);
> > +#ifdef CONFIG_RISCV_ISA_C
> > + u32 opcode = __BUG_INSN_16;
> > +#else
> > + u32 opcode = __BUG_INSN_32;
> > +#endif
> > + patch_text_nosync(p->addr, &opcode, GET_INSN_LENGTH(opcode));
>
> Sounds good, but it will leave some RVI instruction truncated in kernel text,
> i doubt kernel behavior depends on the rest of the truncated instruction, well,
> it needs more strict testing to prove my concern :)
I do this on purpose, and it doesn't cause any problems. Don't worry;
IFU hw must enforce the fetch sequence, and there is no way to execute
broken instructions even in the speculative execution path.

>
> > }
> >
> > /* remove breakpoint from text */
> > void __kprobes arch_disarm_kprobe(struct kprobe *p)
> > {
> > - patch_text(p->addr, p->opcode);
> > +#ifdef CONFIG_RISCV_ISA_C
> > + u32 opcode = __BUG_INSN_16;
> > +#else
> > + u32 opcode = __BUG_INSN_32;
> > +#endif
> > + patch_text_nosync(p->addr, &p->opcode, GET_INSN_LENGTH(opcode));
> > }
> >
> > void __kprobes arch_remove_kprobe(struct kprobe *p)
>
> [1] - https://lore.kernel.org/lkml/20230127130541.1250865-9-chenguokai17@xxxxxxxxxxxxxxxx/
> [2] - https://lore.kernel.org/lkml/20220927022435.129965-1-liaochang1@xxxxxxxxxx/T/
>
> --
> BR,
> Liao, Chang



--
Best Regards
Guo Ren