Re: [PATCH] libbpf: resolve kernel function name optimization for kprobe

From: Alan Maguire
Date: Thu Jan 12 2023 - 05:26:47 EST


On 12/01/2023 07:23, Yonghong Song wrote:
>
>
> On 1/9/23 7:11 PM, Menglong Dong wrote:
>> On Tue, Jan 10, 2023 at 4:29 AM Yonghong Song <yhs@xxxxxxxx> wrote:
>>>
>>>
>>>
>>> On 1/9/23 1:42 AM, menglong8.dong@xxxxxxxxx wrote:
>>>> From: Menglong Dong <imagedong@xxxxxxxxxxx>
>>>>
>>>> The function name in kernel may be changed by the compiler. For example,
>>>> the function 'ip_rcv_core' can be compiled to 'ip_rcv_core.isra.0'.
>>>>
>>>> This kind of optimization can happen in any kernel function. Therefore, we
>>>> should consider this case.
>>>>
>>>> If we fail to attach the kprobe with '-ENOENT', we can look up
>>>> kallsyms, check whether there is a similar function ending with '.xxx',
>>>> and retry.
>>>
>>> This might produce an incorrect result, so this approach won't work
>>> for all .isra.0 cases. When a function name is changed from
>>> <func> to <func>.isra.<num>, it is possible that the compiler has
>>> made some changes to the arguments, e.g., removing one argument,
>>> changing the semantics of an argument, etc. If the bpf program still
>>> uses the original function signature, the bpf program may
>>> produce unexpected results.
>>
>> Oops, I wasn't aware of this part. Can we disable this behavior by
>> default and offer users an option to enable it? Such as:
>>
>>      bpf_object_adapt_sym(struct bpf_object *obj)
>>
>> In my case, kernel function renaming is common, and I have to
>> check all functions and do such adaptation before attaching
>> my kprobe programs, which prevents me from using auto-attach.
>>
>> What's more, I haven't seen the arguments change so far;
>> maybe it's not a common case?
>
> I don't have statistics, but it happens. In general, if you
> want to attach to a function like <foo>, but it has a variant
> <foo>.isra.<num>, you probably should check the assembly code
> to ensure the parameter semantics have not changed, and then
> you can attach a kprobe to <foo>.isra.<num>, which
> I assume the current libbpf infrastructure should support.
> After you investigate all these <foo>.isra.<num> functions
> and confirm their argument semantics haven't changed, you
> could use kprobe multi to do the attachment.
>

I crunched some numbers on this, and discovered that out of ~1600
.isra/.constprop functions, 76 had a missing argument. The patch series
at [1] is a rough attempt to get pahole to spot these and add
BTF entries for each, where the BTF representation reflects
reality by skipping optimized-out arguments. So for a function
like

static int ip6_nh_lookup_table(struct net *net, struct fib6_config *cfg,
const struct in6_addr *gw_addr, u32 tbid,
int flags, struct fib6_result *res);

Examining the BTF representation using pahole from [1], we see

int ip6_nh_lookup_table.isra.0(struct net *net, struct fib6_config *cfg, struct in6_addr *gw_addr, u32 tbid, int flags);

Comparing to the definition, we see the last parameter, the
"struct fib6_result *" argument, is missing. The calling pattern -
where the callers have a struct fib6_result on the stack and pass a pointer -
is reflected in late DWARF info, which shows the argument is not actually
passed in a register but can be expressed as an offset relative to the
current function's stack frame (DW_OP_fbreg).

This approach however introduces the problem that the kernel currently
doesn't allow a "." in a function name. We can fix that, but any BTF encoding
that introduces optimized functions containing a "." would have to be opt-in
via a pahole option, so we do not generate invalid vmlinux BTF for kernels
without that change.

An alternative approach would be to simply encode .isra functions
in BTF without the .isra suffix (i.e. using "function_name" not
"function_name.isra"), only doing the BTF encoding if no arguments were
optimized out - i.e. if the function signature matches expectations.
The 76 functions with optimized-out parameters could simply be skipped.
To me that feels like the simpler approach - it avoids issues
with function name BTF encoding, and with that sort of model a
loose-matching kallsyms approach - like the one described here - could be
used for kprobes and fentry/fexit. It also fits with the DWARF
representation - the .isra suffixes are not present in DWARF
representations of the function, only in the symbol table and kallsyms -
so perhaps BTF should follow suit and not add the suffixes. What do you
think?

Alan

[1] https://github.com/acmel/dwarves/compare/master...alan-maguire:dwarves:optimized
>>
>> (Please just ignore this reply if it doesn't work :/ )
>>
>> Thanks!
>> Menglong Dong
>>>
>>>>
>>>> Signed-off-by: Menglong Dong <imagedong@xxxxxxxxxxx>
>>>> ---
>>>>    tools/lib/bpf/libbpf.c | 37 ++++++++++++++++++++++++++++++++++++-
>>>>    1 file changed, 36 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>>>> index a5c67a3c93c5..fdfb1ca34ced 100644
>>>> --- a/tools/lib/bpf/libbpf.c
>>>> +++ b/tools/lib/bpf/libbpf.c
>>>> @@ -10375,12 +10375,30 @@ bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
>>>>        return libbpf_err_ptr(err);
>>>>    }
>>>>
>>>> +struct kprobe_resolve {
>>>> +     char pattern[128];
>>>> +     char name[128];
>>>> +};
>>>> +
>>>> +static int kprobe_kallsyms_cb(unsigned long long sym_addr, char sym_type,
>>>> +                           const char *sym_name, void *ctx)
>>>> +{
>>>> +     struct kprobe_resolve *res = ctx;
>>>> +
>>>> +     if (!glob_match(sym_name, res->pattern))
>>>> +             return 0;
>>>> +     strcpy(res->name, sym_name);
>>>> +     return 1;
>>>> +}
>>>> +
>>>>    static int attach_kprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link)
>>>>    {
>>>>        DECLARE_LIBBPF_OPTS(bpf_kprobe_opts, opts);
>>>> +     struct kprobe_resolve res = {};
>>>>        unsigned long offset = 0;
>>>>        const char *func_name;
>>>>        char *func;
>>>> +     int err;
>>>>        int n;
>>>>
>>>>        *link = NULL;
>>>> @@ -10408,8 +10426,25 @@ static int attach_kprobe(const struct bpf_program *prog, long cookie, struct bpf
>>>>
>>>>        opts.offset = offset;
>>>>        *link = bpf_program__attach_kprobe_opts(prog, func, &opts);
>>>> +     err = libbpf_get_error(*link);
>>>> +
>>>> +     if (!err || err != -ENOENT)
>>>> +             goto out;
>>>> +
>>>> +     sprintf(res.pattern, "%s.*", func);
>>>> +     if (!libbpf_kallsyms_parse(kprobe_kallsyms_cb, &res))
>>>> +             goto out;
>>>> +
>>>> +     pr_warn("prog '%s': trying to create %s '%s+0x%zx' perf event instead\n",
>>>> +             prog->name, opts.retprobe ? "kretprobe" : "kprobe",
>>>> +             res.name, offset);
>>>> +
>>>> +     *link = bpf_program__attach_kprobe_opts(prog, res.name, &opts);
>>>> +     err = libbpf_get_error(*link);
>>>> +
>>>> +out:
>>>>        free(func);
>>>> -     return libbpf_get_error(*link);
>>>> +     return err;
>>>>    }
>>>>
>>>>    static int attach_ksyscall(const struct bpf_program *prog, long cookie, struct bpf_link **link)