Re: [PATCH v5] arm64: kprobe: Enable OPTPROBE for arm64

From: Masami Hiramatsu
Date: Thu Jan 13 2022 - 01:04:24 EST


On Wed, 12 Jan 2022 20:21:57 +0800
Jianhua Liu <jianhua.ljh@xxxxxxxxx> wrote:

> On Wed, Jan 12, 2022 at 9:31 AM liuqi (BA) <liuqi115@xxxxxxxxxx> wrote:
> >
> >
> >
> > On 2022/1/4 10:35, Masami Hiramatsu wrote:
> > > Hi Jianhua,
> > >
> > > On Mon, 3 Jan 2022 17:03:33 +0800
> > > Jianhua Liu <jianhua.ljh@xxxxxxxxx> wrote:
> > >
> > >> Hi Qi,
> > >> I have tested your patch on UNISOC s9863a.
> > >> Test case "kprobe_example & kretprobe_example" is OK.
> > >>
> > >> Two points:
> > >> 1. The backtrace is not perfect.
> > >> optprobe_common does not save the frame pointer, so the
> > >> backtrace lacks two calls.
> > >> For example, for dup_mm it lacks copy_process-->dup_mm.
> > >> dup_mm backtrace from your patch:
> > >> [ 832.387066] CPU: 0 PID: 296 Comm: sh Not tainted 5.16.0-rc5+ #8
> > >> [ 832.387078] Hardware name: Spreadtrum SP9863A-1H10 Board (DT)
> > >> [ 832.387083] Call trace:
> > >> [ 832.387086] dump_backtrace+0x0/0x1e0
> > >> [ 832.387103] show_stack+0x24/0x30
> > >> [ 832.387112] dump_stack_lvl+0x68/0x84
> > >> [ 832.387123] dump_stack+0x18/0x34
> > >> [ 832.387131] handler_pre+0x40/0x50 [kprobe_example]
> > >> [ 832.387143] opt_pre_handler+0x84/0xc0
> > >> [ 832.387154] optprobe_optimized_callback+0xec/0x164
> > >> [ 832.387164] optprobe_common+0x70/0xc4
> > >> [ 832.387173] kernel_clone+0x98/0x440
> > >> [ 832.387182] __do_sys_clone+0x54/0x80
> > >> [ 832.387191] __arm64_sys_clone+0x2c/0x40
> > >> [ 832.387199] invoke_syscall+0x50/0x120
> > >> [ 832.387208] el0_svc_common.constprop.0+0x4c/0xf4
> > >> [ 832.387217] do_el0_svc+0x30/0x9c
> > >> [ 832.387225] el0_svc+0x20/0x60
> > >> [ 832.387235] el0t_64_sync_handler+0xe8/0xf0
> > >> [ 832.387242] el0t_64_sync+0x1a0/0x1a4
> > >>
> > >>
> > >> dup_mm backtrace from other:
> > >> [ 173.352294] CPU: 6 PID: 309 Comm: sh Not tainted 5.16.0-rc5+ #19
> > >> [ 173.352301] Hardware name: Spreadtrum SP9863A-1H10 Board (DT)
> > >> [ 173.352304] Call trace:
> > >> [ 173.352307] dump_backtrace+0x0/0x1d4
> > >> [ 173.352319] show_stack+0x18/0x24
> > >> [ 173.352326] dump_stack_lvl+0x68/0x84
> > >> [ 173.352333] dump_stack+0x18/0x34
> > >> [ 173.352338] handler_pre+0x38/0x48 [kprobe_example]
> > >> [ 173.352347] opt_pre_handler+0x74/0xb0
> > >> [ 173.352354] optimized_callback+0x108/0x130
> > >> [ 173.352361] optinsn_slot+0x258/0x1000
> > >> [ 173.352366] dup_mm+0x4/0x4b0
> > >> [ 173.352373] copy_process+0x1284/0x1360
> > >> [ 173.352378] kernel_clone+0x5c/0x3c0
> > >> [ 173.352384] __do_sys_clone+0x54/0x80
> > >> [ 173.352390] __arm64_sys_clone+0x24/0x30
> > >> [ 173.352396] invoke_syscall+0x48/0x114
> > >> [ 173.352402] el0_svc_common.constprop.0+0x44/0xec
> > >> [ 173.352408] do_el0_svc+0x24/0x90
> > >> [ 173.352413] el0_svc+0x20/0x60
> > >> [ 173.352420] el0t_64_sync_handler+0xe8/0xf0
> > >> [ 173.352427] el0t_64_sync+0x1a0/0x1a4
> > >
> >
> > Hi Masami and Jianhua,
> >
> > optprobe_common() is added to minimize the size of code in the trampoline, but
> > each trampoline is allocated as PAGE_SIZE, so optprobe_common() seems
> > unnecessary, and will make optprobe_trampoline.S much more complicated.
> > How about dropping optprobe_common() and using a macro to reduce duplicate code?
> >
> 1. each trampoline is allocated as
> (MAX_OPTINSN_SIZE*sizeof(kprobe_opcode_t)), not PAGE_SIZE

Right. What Jianhua pointed out is that alloc_optinsn_page() is expected to
return PAGE_SIZE of memory. Each trampoline "slot" is allocated from
that page. See __get_insn_slot()@kernel/kprobes.c for details.
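
As a rough illustration of the slot layout described above, here is a
minimal user-space sketch. It is not the kernel code: the sketch_* names
and SKETCH_PAGE_SIZE are hypothetical, and it assumes a 4-byte
kprobe_opcode_t (as on arm64) and a 4 KiB page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one arm64 instruction word per opcode. */
typedef uint32_t kprobe_opcode_t;

/* Hypothetical page size; the real PAGE_SIZE depends on the kernel config. */
#define SKETCH_PAGE_SIZE 4096UL

/* Number of slots in one page when each slot holds insn_size opcodes. */
static size_t sketch_slots_per_page(size_t insn_size)
{
	assert(insn_size > 0);
	return SKETCH_PAGE_SIZE / (insn_size * sizeof(kprobe_opcode_t));
}

/* Start of slot i in the page; the pointer arithmetic is in opcode units. */
static kprobe_opcode_t *sketch_slot_addr(kprobe_opcode_t *page,
					 size_t insn_size, size_t i)
{
	assert(i < sketch_slots_per_page(insn_size));
	return page + i * insn_size;
}
```

With insn_size = 64 opcodes, for example, each slot occupies 256 bytes,
so a 4 KiB page holds 16 slots.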

> 2. MAX_OPTINSN_SIZE should be "((unsigned long)(optprobe_template_end
> - optprobe_template_entry)),
> your MAX_OPTINSN_SIZE is not accurate.

Good catch! MAX_OPTINSN_SIZE is counted not in bytes, but in units of
sizeof(kprobe_opcode_t).

+#define MAX_OPTINSN_SIZE \
+ ((unsigned long)optprobe_template_restore_end - (unsigned long)optprobe_template_entry)

This calculates MAX_OPTINSN_SIZE in bytes. Hmm, the 32-bit arm
implementation has the same mistake. Thanks for pointing it out!
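
To make the unit mismatch concrete, here is a small user-space sketch.
It is only one possible way to express the conversion, not the fix
proposed in this thread; the sketch_* helpers are hypothetical and a
4-byte kprobe_opcode_t is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one arm64 instruction word per opcode. */
typedef uint32_t kprobe_opcode_t;

/* Convert a template length in bytes to a count of opcodes, rounding up. */
static size_t sketch_bytes_to_opcodes(size_t bytes)
{
	return (bytes + sizeof(kprobe_opcode_t) - 1) / sizeof(kprobe_opcode_t);
}

/* Bytes reserved for one slot when insn_size is taken as an opcode count. */
static size_t sketch_slot_bytes(size_t insn_size)
{
	assert(insn_size > 0);
	return insn_size * sizeof(kprobe_opcode_t);
}
```

For a 200-byte template, passing the byte count (200) where an opcode
count is expected reserves 800 bytes per slot, while the converted value
(50) reserves the intended 200 bytes.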

> 3. optprobe_template_val in different kprobes may not be 8-byte aligned.
> The ldr instruction for this value may use an address that is not 8-byte aligned.
> "ldr x0, 1f
> .global optprobe_template_common"

Ouch. Hmm, I think we should adjust MAX_OPTINSN_SIZE so that it is always 2n,
or just add a nop (or an additional .long 0) with a comment.
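
The "always 2n" idea can be sketched in user space as follows. This is
an illustration only: the sketch_* helpers are hypothetical, and it
assumes a 4-byte kprobe_opcode_t, a page that is itself 8-byte aligned,
and slots laid out back to back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one arm64 instruction word per opcode. */
typedef uint32_t kprobe_opcode_t;

/* Round an opcode count up to the next even number. */
static size_t sketch_round_up_even(size_t n_opcodes)
{
	return (n_opcodes + 1) & ~(size_t)1;
}

/* Byte offset of slot i from the start of the page. */
static size_t sketch_slot_offset(size_t insn_size, size_t i)
{
	assert(insn_size > 0);
	return i * insn_size * sizeof(kprobe_opcode_t);
}

/* True if the byte offset is a multiple of 8. */
static int sketch_is_8byte_aligned(size_t offset)
{
	return (offset & 7) == 0;
}
```

With an odd size such as 49 opcodes, every second slot starts at an
offset that is only 4-byte aligned; rounding up to 50 keeps every slot
start 8-byte aligned. The 64-bit value inside the template still needs
an 8-byte-aligned offset of its own, which is what the extra nop or
.long 0 would provide.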

BTW, the filename arch/arm64/kernel/probes/opt_arm64.c is redundant;
it should be arch/arm64/kernel/probes/optprobe.c, since we already know
it is for arm64.

Thank you!

>
> Thanks,
> Jianhua
> > Thanks,
> > Qi
> > > Is the second one with your patch?
> > >
> > >>
> > >> 2. The reserved memory "OPT_SLOT_SIZE - PAGE_SIZE" is wasted.
> > >> kernel/kprobes.c uses only one PAGE_SIZE of slot memory.
> > >
> > > Good catch!
> > > Qi, can you make an array (or bit map) of usage flags and
> > > manage the reserved memory?
> > >
> > > #define OPT_INSN_PAGES (OPT_SLOT_SIZE/PAGE_SIZE)
> > > static bool insn_page_in_use[OPT_INSN_PAGES];
> > >
> > > void *alloc_optinsn_page(void)
> > > {
> > > 	int i;
> > >
> > > 	/* Hand out the first reserved page that is not in use. */
> > > 	for (i = 0; i < OPT_INSN_PAGES; i++)
> > > 		if (!insn_page_in_use[i])
> > > 			goto found;
> > > 	return NULL;
> > > found:
> > > 	insn_page_in_use[i] = true;
> > > 	return (void *)((unsigned long)optinsn_slot + PAGE_SIZE * i);
> > > }
> > >
> > > void free_optinsn_page(void *page)
> > > {
> > > 	unsigned long idx = (unsigned long)page - (unsigned long)optinsn_slot;
> > >
> > > 	/* The page must be page-aligned and inside the reserved area. */
> > > 	WARN_ON_ONCE(idx & (PAGE_SIZE - 1));
> > > 	idx >>= PAGE_SHIFT;
> > > 	if (WARN_ON_ONCE(idx >= OPT_INSN_PAGES))
> > > 		return;
> > > 	insn_page_in_use[idx] = false;
> > > }
> > >
> > > Thank you,
> > >
> > >
> > >
> > >


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>