Re: Broken kretprobe stack traces

From: Masami Hiramatsu
Date: Wed Mar 03 2021 - 06:56:39 EST


Hi Daniel,

On Tue, 02 Mar 2021 17:15:13 -0800
"Daniel Xu" <dxu@xxxxxxxxx> wrote:

> Hi Masami,
>
> Jakub reported a bug with kretprobe stack traces -- wondering if you've gotten
> any bug reports related to stack traces being broken for kretprobes.

Yeah, stack dumper must check the stack entry is kretprobe'd and skip it.
For example, ftrace checks it as below.

/sys/kernel/debug/tracing # echo r vfs_read > kprobe_events
/sys/kernel/debug/tracing # echo stacktrace > events/kprobes/r_vfs_read_0/trigger
/sys/kernel/debug/tracing # echo 1 > events/kprobes/r_vfs_read_0/enable
/sys/kernel/debug/tracing # head -20 trace
# tracer: nop
#
# entries-in-buffer/entries-written: 15685/336094 #P:8
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
sh-132 [006] ...1 38.920352: <stack trace>
=> kretprobe_dispatcher
=> __kretprobe_trampoline_handler
=> trampoline_handler
=> [unknown/kretprobe'd]
=> [unknown/kretprobe'd]
=> __x64_sys_read
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe


>
> I think (can't prove) this used to work:

I'm not sure the bpftrace had correctly handled it or not.

>
> # bpftrace -e 'kretprobe:__tcp_retransmit_skb { @[kstack()] = count() }'
> Attaching 1 probe...
> ^C
>
> @[
> kretprobe_trampoline+0
> ]: 1

Would you know how the bpftrace stacktracer rewinds the stack entries?
FYI, ftrace does it in trace_seq_print_sym()@kernel/trace/trace_output.c

Thank you,

>
> fentry/fexit probes seem to work:
>
> # bpftrace -e 'kretfunc:__tcp_retransmit_skb { @[kstack()] = count() }'
> Attaching 1 probe...
> ^C
>
> @[
> ftrace_trampoline+10799
> bpf_get_stackid_raw_tp+121
> ftrace_trampoline+10799
> __tun_chr_ioctl.isra.0.cold+33312
> __tcp_retransmit_skb+5
> tcp_send_loss_probe+254
> tcp_write_timer_handler+394
> tcp_write_timer+149
> call_timer_fn+41
> __run_timers+493
> run_timer_softirq+25
> __softirqentry_text_start+207
> asm_call_sysvec_on_stack+18
> do_softirq_own_stack+55
> irq_exit_rcu+158
> sysvec_apic_timer_interrupt+54
> asm_sysvec_apic_timer_interrupt+18
> ]: 1
> @[
> ftrace_trampoline+10799
> bpf_get_stackid_raw_tp+121
> ftrace_trampoline+10799
> __tun_chr_ioctl.isra.0.cold+33312
> __tcp_retransmit_skb+5
> <...>
>
> which makes me suspect it's a kprobe specific issue.
>
> Thanks,
> Daniel


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>