Re: [PATCH v5 0/9] Add OPTPROBES feature on RISCV

From: Björn Töpel
Date: Mon Jan 02 2023 - 13:03:24 EST


Chen Guokai <chenguokai17@xxxxxxxxxxxxxxxx> writes:

> Add jump optimization support for RISC-V.

Thank you for continuing to work on the series! I took the series for a
spin, and ran into a number of issues that makes me wonder how you test
the series, and how the testing is different from my runs.

I'll outline the general/big issues here, and leave the specifics per-patch.

I've done simple testing, using "Kprobe-based Event Tracing"
(CONFIG_KPROBE_EVENTS=y) via tracefs.

All the tests were run on commit 88603b6dc419 ("Linux 6.2-rc2") with the
series applied. All the bugs were trigged by setting different probes to
do_sys_openat2. Code:

do_sys_openat2:
...snip...
ffffffff802d138c: 89aa c.mv s3,a0 // +44
ffffffff802d138e: 892e c.mv s2,a1 // +46
ffffffff802d1390: 8532 c.mv a0,a2
ffffffff802d1392: fa040593 addi a1,s0,-96
ffffffff802d1396: 84b2 c.mv s1,a2
ffffffff802d1398: fa043023 sd zero,-96(s0)
ffffffff802d139c: fa043423 sd zero,-88(s0)
ffffffff802d13a0: fa042823 sw zero,-80(s0)
ffffffff802d13a4: 00000097 auipc ra,0x0
...snip...


1. Fail to register kprobe to c.mv

Add a kprobe:
echo 'p do_sys_openat2+44' > /sys/kernel/debug/tracing/kprobe_events

register_kprobe returns -22 (EINVAL). This is due to a bug in the
instruction decoder. I've sent to fix upstream [1].

2. (with [1] applied) Oops when register a probe

Add a kprobe:
echo 'p do_sys_openat2+44' > /sys/kernel/debug/tracing/kprobe_events

You get a splat:
Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000000008
Oops [#1]
Modules linked in:
CPU: 1 PID: 242 Comm: bash Tainted: G W 6.2.0-rc2-00010-g09ff1aa7b1f9-dirty #14
Hardware name: riscv-virtio,qemu (DT)
epc : riscv_probe_decode_insn+0x16a/0x192
ra : riscv_probe_decode_insn+0x32/0x192
epc : ffffffff8127b2bc ra : ffffffff8127b184 sp : ff2000000173bac0
gp : ffffffff82533f70 tp : ff60000086ab2b40 t0 : 0000000000000000
t1 : 0000000000000850 t2 : 65646f6365642054 s0 : ff2000000173bae0
s1 : 0000000000000017 a0 : 000000000000e001 a1 : 000000000000003f
a2 : 0000000000009002 a3 : 0000000000000017 a4 : 000000000000c001
a5 : ffffffff8127b38a a6 : ff6000047d666000 a7 : 0000000000040000
s2 : 0000000000000000 s3 : 0000000000000006 s4 : ff6000008558f718
s5 : ff6000008558f718 s6 : 0000000000000001 s7 : ff6000008558f768
s8 : 0000000000000007 s9 : 0000000000000003 s10: 0000000000000002
s11: 00aaaaaad62baf78 t3 : 0000000000000000 t4 : 8dd70b0100000000
t5 : ffffffffffffe000 t6 : ff2000000173b8c8
status: 0000000200000120 badaddr: 0000000000000008 cause: 000000000000000f
[<ffffffff81257e48>] arch_prepare_optimized_kprobe+0xc2/0x4ec
[<ffffffff8125b420>] alloc_aggr_kprobe+0x5c/0x6a
[<ffffffff8125ba0a>] register_kprobe+0x5dc/0x6a2
[<ffffffff8016f266>] __register_trace_kprobe.part.0+0x98/0xbc
[<ffffffff80170544>] __trace_kprobe_create+0x6ea/0xbcc
[<ffffffff80176cee>] trace_probe_create+0x6c/0x7c
[<ffffffff8016f1a2>] create_or_delete_trace_kprobe+0x24/0x50
[<ffffffff80150642>] trace_parse_run_command+0x9e/0x12a
[<ffffffff8016f176>] probes_write+0x18/0x20
[<ffffffff802d494a>] vfs_write+0xca/0x41e
[<ffffffff802d4f96>] ksys_write+0x70/0xee
[<ffffffff802d5036>] sys_write+0x22/0x2a
[<ffffffff80004196>] ret_from_syscall+0x0/0x2

This is because a call to riscv_probe_decode_insn(probe_opcode_t *addr,
struct arch_probe_insn *api), where api is NULL (and tripping over
auipc). Should be a common scenario...

3. No bound check for instructions

Add a probe to a non-valid instruction (in the middle of addi):
echo 'p 0xffffffff802d1394' > /sys/kernel/debug/tracing/kprobe_events

You get the same splat as above from the auipc NULL-pointer, but the
"half" addi-instruction is parsed as a correct instruction.

4. Lockdep splat

Might be false positive; When enabling a probe, e.g.
echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable


======================================================
WARNING: possible circular locking dependency detected

------------------------------------------------------
bash/244 is trying to acquire lock:
ffffffff8223ee90 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x2c/0x54

but task is already holding lock:
ffffffff82249f70 (text_mutex){+.+.}-{3:3}, at: ftrace_arch_code_modify_prepare+0x1a/0x22

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (text_mutex){+.+.}-{3:3}:
lock_acquire+0x10a/0x328
__mutex_lock+0xa8/0x770
mutex_lock_nested+0x28/0x30
register_kprobe+0x3ae/0x5ea
__register_trace_kprobe.part.0+0x98/0xbc
__trace_kprobe_create+0x6ea/0xbcc
trace_probe_create+0x6c/0x7c
create_or_delete_trace_kprobe+0x24/0x50
trace_parse_run_command+0x9e/0x12a
probes_write+0x18/0x20
vfs_write+0xca/0x41e
ksys_write+0x70/0xee
sys_write+0x22/0x2a
ret_from_syscall+0x0/0x2

-> #0 (cpu_hotplug_lock){++++}-{0:0}:
check_noncircular+0x122/0x13a
__lock_acquire+0x1058/0x20e4
lock_acquire+0x10a/0x328
cpus_read_lock+0x4c/0x11c
stop_machine+0x2c/0x54
arch_ftrace_update_code+0x2e/0x4c
ftrace_startup+0xd0/0x15e
register_ftrace_function+0x32/0x7c
arm_kprobe+0x132/0x198
enable_kprobe+0x9c/0xc0
enable_trace_kprobe+0x6e/0xea
kprobe_register+0x64/0x6c
__ftrace_event_enable_disable+0x72/0x246
event_enable_write+0x94/0xe4
vfs_write+0xca/0x41e
ksys_write+0x70/0xee
sys_write+0x22/0x2a
ret_from_syscall+0x0/0x2

other info that might help us debug this:

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(text_mutex);
lock(cpu_hotplug_lock);
lock(text_mutex);
lock(cpu_hotplug_lock);

*** DEADLOCK ***

5 locks held by bash/244:
#0: ff60000080f49438 (sb_writers#12){.+.+}-{0:0}, at: ksys_write+0x70/0xee
#1: ffffffff822d9468 (event_mutex){+.+.}-{3:3}, at: event_enable_write+0x7c/0xe4
#2: ffffffff822d3fa8 (kprobe_mutex){+.+.}-{3:3}, at: enable_kprobe+0x32/0xc0
#3: ffffffff822d56d8 (ftrace_lock){+.+.}-{3:3}, at: register_ftrace_function+0x26/0x7c
#4: ffffffff82249f70 (text_mutex){+.+.}-{3:3}, at: ftrace_arch_code_modify_prepare+0x1a/0x22

stack backtrace:
CPU: 2 PID: 244 Comm: bash Not tainted 6.2.0-rc1-00008-g544b2c59fd81 #1
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
[<ffffffff80006e80>] dump_backtrace+0x30/0x38
[<ffffffff81256e82>] show_stack+0x40/0x4c
[<ffffffff8126e054>] dump_stack_lvl+0x62/0x84
[<ffffffff8126e08e>] dump_stack+0x18/0x20
[<ffffffff8009b37e>] print_circular_bug+0x2ac/0x318
[<ffffffff8009b50c>] check_noncircular+0x122/0x13a
[<ffffffff8009e020>] __lock_acquire+0x1058/0x20e4
[<ffffffff8009f90c>] lock_acquire+0x10a/0x328
[<ffffffff8002fb8a>] cpus_read_lock+0x4c/0x11c
[<ffffffff8011ed60>] stop_machine+0x2c/0x54
[<ffffffff8013aec6>] arch_ftrace_update_code+0x2e/0x4c
[<ffffffff8013e796>] ftrace_startup+0xd0/0x15e
[<ffffffff8013e856>] register_ftrace_function+0x32/0x7c
[<ffffffff8012f928>] arm_kprobe+0x132/0x198
[<ffffffff8012fa2a>] enable_kprobe+0x9c/0xc0
[<ffffffff8016ff62>] enable_trace_kprobe+0x6e/0xea
[<ffffffff801700da>] kprobe_register+0x64/0x6c
[<ffffffff8015eba6>] __ftrace_event_enable_disable+0x72/0x246
[<ffffffff8015eeea>] event_enable_write+0x94/0xe4
[<ffffffff802d5e1a>] vfs_write+0xca/0x41e
[<ffffffff802d6466>] ksys_write+0x70/0xee
[<ffffffff802d6506>] sys_write+0x22/0x2a
[<ffffffff80004196>] ret_from_syscall+0x0/0x2


5. 32b support?

I've noticed that there code supports rv32. Is this tested? Does regular
kprobes work on 32b?


Thanks,
Björn


[1] https://lore.kernel.org/linux-riscv/20230102160748.1307289-1-bjorn@xxxxxxxxxx/