Re: [PATCH v3 0/2] kstats: kernel metric collector

From: Toke Høiland-Jørgensen
Date: Wed Feb 26 2020 - 18:11:45 EST


Luigi Rizzo <lrizzo@xxxxxxxxxx> writes:

> - the runtime cost and complexity of hooking bpf code is still a bit
> unclear to me. kretprobe or tracepoints are expensive, I suppose that
> some lean hook to replace register_kretprobe() may exist and the
> difference from inline annotations would be marginal (we'd still need
> to put in the hooks around the code we want to time, though, so it
> wouldn't be a pure bpf solution). Any pointers to this are welcome;
> Alexei mentioned fentry/fexit and bpf trampolines, but I haven't found
> an example that lets me do something equivalent to kretprobe (take a
> timestamp before and one after a function without explicit
> instrumentation)

As Alexei said, with fentry/fexit the overhead should be on par with
your example. This functionality is pretty new, though, so I can
understand why it's not obvious how to do things with it yet :)

I think the best place to look is currently in
tools/testing/selftests/bpf in the kernel sources. Grep for 'fexit' and
'fentry' in the progs/ subdir. test_overhead.c and kfree_skb.c seem to
have some examples you may be able to work from.
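
For orientation, the bookkeeping behind "take a timestamp before and one
after a function" can be modelled in plain userspace C. This is only a
sketch under stated assumptions, not BPF code and not taken from the
selftests: hook_entry(), hook_exit(), MAX_TASKS and the power-of-two
bucketing are all hypothetical names and choices made for illustration.
In an actual fentry/fexit setup the two hooks would be BPF programs
attached to the target function, the timestamp would come from
bpf_ktime_get_ns() instead of a parameter, and the two arrays would be
BPF maps.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_TASKS 64 /* hypothetical: slots for in-flight calls, by task id */
#define NBUCKETS  64 /* one bucket per power of two of nanoseconds */

static uint64_t start_ns[MAX_TASKS]; /* stands in for a BPF hash map */
static uint64_t hist[NBUCKETS];      /* stands in for a BPF array map */

/* Index of the highest set bit, i.e. floor(log2(v)); 0 for v == 0. */
static unsigned int log2_bucket(uint64_t v)
{
	unsigned int b = 0;

	while (v >>= 1)
		b++;
	return b;
}

/* What an entry hook would do: remember when this task entered. */
static void hook_entry(unsigned int task, uint64_t now)
{
	start_ns[task % MAX_TASKS] = now;
}

/*
 * What an exit hook would do: compute the elapsed time and count it.
 * A stored timestamp of 0 is treated as "no matching entry seen".
 */
static void hook_exit(unsigned int task, uint64_t now)
{
	uint64_t t0 = start_ns[task % MAX_TASKS];
	unsigned int b;

	if (!t0)
		return;
	start_ns[task % MAX_TASKS] = 0;

	b = log2_bucket(now - t0);
	assert(b < NBUCKETS);
	hist[b]++;
}

/* Print the non-empty buckets, one per line. */
static void dump_hist(void)
{
	unsigned int i;

	for (i = 0; i < NBUCKETS; i++)
		if (hist[i])
			printf("2^%u ns: %llu\n", i,
			       (unsigned long long)hist[i]);
}
```

Calling hook_entry(1, 1000) followed by hook_exit(1, 6000) records a
5000 ns sample in bucket 12 (4096 <= 5000 < 8192). Passing the timestamp
in as an argument is a simplification that keeps the model deterministic;
it is not how a real hook would obtain the time.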

> - I still see some huge differences in usability, and this is in my
> opinion one very big difference between the two approaches. The
> systems where data collection may be of interest are not necessarily
> accessible to developers with the skills to write custom bpf code, or
> load bpf modules (security policies may prevent that). One thing is to
> tell a sysadmin to run "echo trace foo >
> /sys/kernel/debug/kstats/_config" or "watch grep CPUS
> /sys/kernel/debug/kstats/bar", another one is to tell them to load a
> bpf program (or write their own one).

With BPF the solution for this is to distribute a tool that does all the
setup for the user. Basically the userspace equivalent of what you're
proposing to include in the kernel here. You can make this arbitrarily
user-friendly, up to and including having a GUI list all the functions
available in the running kernel and letting the user just click on the
one to measure :)

-Toke