Re: [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper

From: Alexei Starovoitov
Date: Wed Dec 02 2020 - 16:19:20 EST


On Wed, Dec 2, 2020 at 12:32 PM Florent Revest <revest@xxxxxxxxxxxx> wrote:
>
> On Tue, 2020-12-01 at 16:55 -0800, Andrii Nakryiko wrote:
> > On Fri, Nov 27, 2020 at 8:09 AM Yonghong Song <yhs@xxxxxx> wrote:
> > >
> > >
> > > On 11/27/20 3:20 AM, KP Singh wrote:
> > > > On Fri, Nov 27, 2020 at 8:35 AM Yonghong Song <yhs@xxxxxx> wrote:
> > > > >
> > > > > In this case, module name may be truncated and user did not get
> > > > > any indication from return value. In the helper description, it
> > > > > is mentioned that module name currently is most 64 bytes. But
> > > > > from UAPI perspective, it may be still good to return something
> > > > > to let user know the name is truncated.
> > > > >
> > > > > I do not know what is the best way to do this. One suggestion
> > > > > is to break it into two helpers, one for symbol name and
> > > > > another
> > > >
> > > > I think it would be slightly preferable to have one helper
> > > > though. maybe something like bpf_get_symbol_info (better names
> > > > anyone? :)) with flags to get the module name or the symbol name
> > > > depending
> > > > on the flag?
> > >
> > > This works even better. Previously I am thinking if we have two
> > > helpers,
> > > we can add flags for each of them for future extension. But we
> > > can certainly have just one helper with flags to indicate
> > > whether this is for module name or for symbol name or something
> > > else.
> > >
> > > The buffer can be something like
> > > union bpf_ksymbol_info {
> > > char module_name[];
> > > char symbol_name[];
> > > ...
> > > }
> > > and flags will indicate what information user wants.
> >
> > one more thing that might be useful to resolve to the symbol's "base
> > address". E.g., if we have IP inside the function, this would resolve
> > to the start of the function, sort of "canonical" symbol address.
> > Type of ksym is another "characteristic" which could be returned (as
> > a single char?)
> >
> > I wouldn't define bpf_ksymbol_info, though. Just depending on the
> > flag, specify what kind of memory layou (e.g., for strings -
> > zero-terminated string, for address - 8 byte numbers, etc). That way
> > we can also allow fetching multiple things together, they would just
> > be laid out one after another in memory.
> >
> > E.g.:
> >
> > char buf[256];
> > int err = bpf_ksym_resolve(<addr>, BPF_KSYM_NAME | BPF_KSYM_MODNAME |
> > BPF_KSYM_BASE_ADDR, buf, sizeof(buf));
> >
> > if (err == -E2BIG)
> > /* need bigger buffer, but all the data up to truncation point is
> > filled in */
> > else
> > /* err has exact number of bytes used, including zero terminator(s)
> > */
> > /* data is laid out as
> > "cpufreq_gov_powersave_init\0cpufreq_powersave\0\x12\x23\x45\x56\x12\
> > x23\x45\x56"
> > */
>
> Great idea! I like that, thanks for the suggestion :)

I still think that adopting printk/vsnprintf for this instead of
reinventing the wheel
is more flexible and easier to maintain long term.
Almost the same layout can be done with vsnprintf
with exception of \0 char.
More meaningful names, etc.
See Documentation/core-api/printk-formats.rst
If we force fmt to come from readonly map then bpf_trace_printk()-like
run-time check of fmt string can be moved into load time check
and performance won't suffer.