Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test

From: Steven Rostedt
Date: Thu Apr 28 2022 - 09:58:15 EST


On Sat, 16 Apr 2022 23:21:03 +0900
Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:

> OK, I also confirmed that __bpf_tramp_exit is listed. (The others don't seem to have notrace.)
>
> /sys/kernel/tracing # cat available_filter_functions | grep __bpf_tramp
> __bpf_tramp_image_release
> __bpf_tramp_image_put_rcu
> __bpf_tramp_image_put_rcu_tasks
> __bpf_tramp_image_put_deferred
> __bpf_tramp_exit
>
> My gcc is an older one.
> gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
>
> But it seems that __bpf_tramp_exit() doesn't call __fentry__. (I objdump'ed)
>
> ffffffff81208270 <__bpf_tramp_exit>:
> ffffffff81208270: 55 push %rbp
> ffffffff81208271: 48 89 e5 mov %rsp,%rbp
> ffffffff81208274: 53 push %rbx
> ffffffff81208275: 48 89 fb mov %rdi,%rbx
> ffffffff81208278: e8 83 70 ef ff callq ffffffff810ff300 <__rcu_read_lock>
> ffffffff8120827d: 31 d2 xor %edx,%edx

You need to look deeper ;-)
>
>
> >
> > So it's quite bizarre and inconsistent.
>
> Indeed. I guess there is a bug in scripts/recordmcount.pl.

No there isn't.

I added the addresses it was mapping and found this:

ffffffffa828f680 T __bpf_tramp_exit

(which is relocated, but it's trivial to map it to the actual function).

At the end of that function we have:

ffffffff8128f767: 48 8d bb e0 00 00 00 lea 0xe0(%rbx),%rdi
ffffffff8128f76e: 48 8b 40 08 mov 0x8(%rax),%rax
ffffffff8128f772: e8 89 28 d7 00 call ffffffff82002000 <__x86_indirect_thunk_array>
ffffffff8128f773: R_X86_64_PLT32 __x86_indirect_thunk_rax-0x4
ffffffff8128f777: e9 4a ff ff ff jmp ffffffff8128f6c6 <__bpf_tramp_exit+0x46>
ffffffff8128f77c: 0f 1f 40 00 nopl 0x0(%rax)
ffffffff8128f780: e8 8b df dc ff call ffffffff8105d710 <__fentry__>
ffffffff8128f781: R_X86_64_PLT32 __fentry__-0x4
ffffffff8128f785: b8 f4 fd ff ff mov $0xfffffdf4,%eax
ffffffff8128f78a: c3 ret
ffffffff8128f78b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)


Notice the call to fentry!

It's due to this:

void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr)
{
	percpu_ref_put(&tr->pcref);
}

int __weak
arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
			    const struct btf_func_model *m, u32 flags,
			    struct bpf_tramp_progs *tprogs,
			    void *orig_call)
{
	return -ENOTSUPP;
}
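
The tail of the disassembly lines up with that weak function: the value
moved into %eax right after the __fentry__ call, 0xfffffdf4, is -524 as a
signed 32-bit integer, and 524 is ENOTSUPP in include/linux/errno.h. A
minimal user-space check, where imm32_to_int() and
check_tail_returns_enotsupp() are names made up for this sketch and
ENOTSUPP is copied by hand because it is not exported to user space:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Kernel-internal errno (include/linux/errno.h), copied here by hand. */
#define ENOTSUPP 524

/* Reinterpret the 32-bit immediate of "mov $imm32,%eax" as a signed int. */
int32_t imm32_to_int(uint32_t imm)
{
	int32_t val;

	/* memcpy avoids relying on implementation-defined conversion. */
	memcpy(&val, &imm, sizeof(val));
	return val;
}

/* Self-check against the immediate seen in the disassembly. */
void check_tail_returns_enotsupp(void)
{
	assert(imm32_to_int(0xfffffdf4u) == -ENOTSUPP);
}
```

So the three instructions after the nopl padding are a complete
"return -ENOTSUPP;" function body with an __fentry__ call in front of it.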

The weak function is compiled with a call to __fentry__. Its code still
ends up in vmlinux, but its symbol is dropped because it is overridden.
Thus, mcount_loc records this call to __fentry__, and the address gets
mapped to the symbol that precedes it, which just happens to be
__bpf_tramp_exit.
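
To illustrate the mapping, here is a simplified sketch of a
nearest-preceding-symbol lookup. It is not the kernel's kallsyms code;
struct sym, nearest_preceding_sym() and both tables are made up for the
example. The __bpf_tramp_exit address is the unrelocated one implied by
the disassembly above (ffffffff8128f6c6 is __bpf_tramp_exit+0x46), and
the call site is the ffffffff8128f780 line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sym {
	uint64_t addr;
	const char *name;
};

/*
 * Return the last symbol whose address is <= addr, or NULL if addr lies
 * below the first symbol. The table must be sorted by ascending address.
 */
const struct sym *nearest_preceding_sym(const struct sym *tab, size_t n,
					uint64_t addr)
{
	const struct sym *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		assert(i == 0 || tab[i - 1].addr <= tab[i].addr);
		if (tab[i].addr > addr)
			break;
		best = &tab[i];
	}
	return best;
}

/* Symbols as the tools see them: the overridden weak symbol is gone. */
const struct sym symtab_weak_dropped[] = {
	{ 0xffffffff8128f680ULL, "__bpf_tramp_exit" },
};

/* Hypothetical table in which the weak function had kept its symbol. */
const struct sym symtab_weak_kept[] = {
	{ 0xffffffff8128f680ULL, "__bpf_tramp_exit" },
	{ 0xffffffff8128f780ULL, "arch_prepare_bpf_trampoline" },
};
```

Looking up ffffffff8128f780 in the first table gives __bpf_tramp_exit
(at offset 0x100), which is the bogus entry. In the hypothetical second
table the same address would resolve to the weak function itself.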

I made that weak function "notrace" and __bpf_tramp_exit disappeared
from the available_filter_functions list.
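
For reference, the change amounts to something like the following. This
is a sketch that builds in user space, not the actual patch: the macro
definitions and the u32 typedef are stand-ins I'm assuming for the
kernel's own (as far as I know, notrace on x86 expands to the compiler's
no_instrument_function attribute), and check_stub() is a made-up helper.

```c
#include <assert.h>

/* User-space stand-ins for the kernel definitions (assumptions). */
#define notrace	__attribute__((no_instrument_function))
#define __weak	__attribute__((weak))
#define ENOTSUPP 524
typedef unsigned int u32;

struct bpf_tramp_image;
struct btf_func_model;
struct bpf_tramp_progs;

/* Same weak stub as before, now also marked notrace. */
int __weak notrace
arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
			    const struct btf_func_model *m, u32 flags,
			    struct bpf_tramp_progs *tprogs,
			    void *orig_call)
{
	return -ENOTSUPP;
}

/* The annotation does not change behaviour: the stub still fails. */
void check_stub(void)
{
	assert(arch_prepare_bpf_trampoline(0, 0, 0, 0, 0, 0, 0) == -ENOTSUPP);
}
```

With the annotation the compiler should not emit the __fentry__ call for
the stub, so there is nothing left for mcount_loc to attribute to the
preceding symbol.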

-- Steve