Re: [RFC PATCH 00/22] perf tools: introduce 'perf bpf' command to load eBPF programs.

From: Alexei Starovoitov
Date: Mon May 04 2015 - 23:02:17 EST


On 5/2/15 12:19 AM, Wang Nan wrote:

I'd like to do the following work in the next version (based on my experience and the feedback):

1. Safely clean up kprobe points after unloading;

2. Add subcommand space to 'perf bpf'. The current stuff should reside in 'perf bpf load';

3. Extract the eBPF ELF walking and collecting work into a separate library to help others.

that's a good list.

The feedback for existing patches:
patch 18 - since we're creating a generic library for bpf elf
loading, it would be great to do the following:
first try to load with
attr.log_buf = NULL;
attr.log_level = 0;
then only if it fails, allocate a buffer and repeat with log_level = 1.
The reason is that it's better to have fast program loading by default
without any verbosity emitted by verifier.
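
A minimal sketch of that quiet-first, verbose-on-failure flow. The
struct load_attr, the load_fn callback and the stub loaders are hypothetical
stand-ins so the logic can be exercised without a kernel; a real loader would
fill the log_buf/log_size/log_level fields of the bpf_attr it passes to
BPF_PROG_LOAD instead. The buffer size is an arbitrary choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the log-related fields of the load attributes. */
struct load_attr {
	char *log_buf;          /* NULL on the fast path */
	unsigned int log_size;
	unsigned int log_level; /* 0 = quiet, 1 = ask for the verifier log */
};

/* Hypothetical loader callback: returns an fd >= 0 on success, -1 on failure. */
typedef int (*load_fn)(struct load_attr *attr);

#define LOG_BUF_SIZE 65536 /* arbitrary size chosen for this sketch */

/*
 * First try to load without any log buffer. Only if that fails, allocate a
 * buffer and repeat with log_level = 1. On failure *log_out points to the
 * log text and the caller must free() it.
 */
int load_quiet_then_verbose(load_fn do_load, char **log_out)
{
	struct load_attr attr = { .log_buf = NULL, .log_size = 0, .log_level = 0 };
	int fd;

	assert(do_load && log_out);
	*log_out = NULL;

	fd = do_load(&attr);            /* fast path, no verbosity */
	if (fd >= 0)
		return fd;

	attr.log_buf = calloc(1, LOG_BUF_SIZE);
	if (!attr.log_buf)
		return -1;
	attr.log_size = LOG_BUF_SIZE;
	attr.log_level = 1;

	fd = do_load(&attr);            /* slow path, collect the log */
	if (fd >= 0) {
		free(attr.log_buf);
		return fd;
	}
	*log_out = attr.log_buf;
	return -1;
}

/* Stub loaders, only here to exercise the sketch without a kernel. */
static int stub_load_calls;

int stub_load_ok(struct load_attr *attr)
{
	(void)attr;
	stub_load_calls++;
	return 3;                       /* pretend fd */
}

int stub_load_bad(struct load_attr *attr)
{
	stub_load_calls++;
	if (attr->log_level && attr->log_buf)
		strncpy(attr->log_buf, "R0 !read_ok", attr->log_size - 1);
	return -1;
}
```

With this shape the common case costs one load attempt and no allocation;
the log buffer exists only when there is something to report.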

patch 19 - I think it's unnecessary.
The verifier already dumps it, so this '-v' flag can be translated into
verbose loading.
There is also .s output from llvm for those interested in bpf asm
instructions.

My colleague He Kuang is working on variable access. Probing inside a function body
and accessing its local variables will be supported like this:

SEC("config") char _prog_config[] = "prog: func_name:1234 vara=localvara";
int prog(struct pt_regs *ctx, unsigned long vara) {
// vara is the value of localvara of function func_name
}

that would be great. I'm not sure though how you can achieve that
without changing the C front-end?
This type of feature is exactly the reason why we're trying to write
our front-end.
In general there are two ways to achieve 'restricted C' language:
- start from clang and chop all features that are not supported.
I believe Jovi already tried to do that and it became very difficult.
- start from simple front-end with minimal C and add all things one by
one. That's what we're trying to do. So far we have most of normal
syntax. The problem with our approach is that we cannot easily do
#include of existing .h files. We're working on that.
It's still too experimental. Maybe we will drop it and go back to the
first approach.

The reason for extending front-end is your example above, where
the user would want to write:
int prog(struct pt_regs *ctx, unsigned long vara) {
// use 'vara'
but generated BPF should have only one 'ctx' pointer, since that's
the only thing that verifier will accept. bpf/core and JITs expect
only one argument, etc.
So this func definition + 'vara' access can be compiled as ctx->si
(if vara is actually in register) or
bpf_probe_read(ctx->bp + magic_offset_from_debug_info)
(if vara is on stack)
or it can also be done via store_trace_args() but that will be slower
and requires hacking kernel, whereas ctx->... style is pure userspace.
Lots of things to brainstorm. So please share your progress soon.
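
A user-space sketch of the two ctx-based lowerings described above, for
illustration only. struct fake_regs, struct var_loc, fake_probe_read() and
fetch_var() are all hypothetical names: fake_regs stands in for the two
pt_regs fields mentioned (si and bp), fake_probe_read() stands in for
bpf_probe_read(), and bp_offset plays the role of the offset taken from
debug info. A front-end would emit the equivalent access inline rather than
call a helper like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the x86-64 struct pt_regs. */
struct fake_regs {
	unsigned long si;
	unsigned long bp;
};

/* Where debug info says the variable lives at the probe point. */
enum var_kind { VAR_IN_REG_SI, VAR_ON_STACK };

struct var_loc {
	enum var_kind kind;
	long bp_offset;         /* only used for VAR_ON_STACK */
};

/* User-space stand-in for bpf_probe_read(dst, size, src). */
static int fake_probe_read(void *dst, size_t size, const void *src)
{
	memcpy(dst, src, size);
	return 0;
}

/*
 * What an access to 'vara' could be lowered to: a direct ctx field read if
 * the variable is in a register, or a probe read relative to ctx->bp if it
 * is on the stack. Only 'ctx' is a real argument of the program.
 */
unsigned long fetch_var(const struct fake_regs *ctx, const struct var_loc *loc)
{
	unsigned long val = 0;

	assert(ctx && loc);
	if (loc->kind == VAR_IN_REG_SI)
		return ctx->si;

	fake_probe_read(&val, sizeof(val),
			(const void *)(uintptr_t)(ctx->bp + loc->bp_offset));
	return val;
}
```

Either way the generated program still takes the single 'ctx' pointer; the
extra parameter in the user's source is only a front-end convenience.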

And I want to discuss with you and others about:

1. How to make eBPF output its tracing and aggregation results to perf?

well, the output of a bpf program is data stored in maps. Each program
needs a corresponding user space reader/printer/sorter of this data.
For example, tracex2 prints this data as a histogram and tracex3 prints it as
a heatmap. We can standardize a few things like this, but ideally we
keep it up to the user, so that the user can write a single file that consists
of functions that are loaded as bpf into kernel and other functions
that are executed in user space. llvm can jit first set to bpf and
second set to x86. That's distant future though.
So far samples/bpf/ style of kern.c+user.c worked quite well.
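
A small sketch of the user-space "printer" half of that split, in the
spirit of a log2 histogram. The function names, the row format and MAX_STARS
are assumptions made for this example, and the counters are passed in as a
plain array; a real reader would first fetch them from the map with lookups
through the bpf() syscall.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#define MAX_STARS 40 /* arbitrary bar width for this sketch */

/* Scale one counter to a bar width in [0, MAX_STARS]. */
int bar_width(unsigned long val, unsigned long max)
{
	if (max == 0)
		return 0;
	return (int)((unsigned long long)val * MAX_STARS / max);
}

/*
 * Print one row per non-empty log2 bucket; bucket i covers
 * [2^i, 2^(i+1) - 1]. Returns the number of rows printed.
 */
int print_log2_hist(FILE *out, const unsigned long *slots, int nr_slots)
{
	unsigned long max = 0;
	int i, j, rows = 0;

	/* Keep the shifts below well defined. */
	assert(nr_slots >= 0 &&
	       (size_t)nr_slots < sizeof(unsigned long) * CHAR_BIT);

	for (i = 0; i < nr_slots; i++)
		if (slots[i] > max)
			max = slots[i];

	for (i = 0; i < nr_slots; i++) {
		int w;

		if (!slots[i])
			continue;
		w = bar_width(slots[i], max);
		fprintf(out, "%10lu -> %-10lu : %-8lu |",
			1UL << i, (1UL << (i + 1)) - 1, slots[i]);
		for (j = 0; j < w; j++)
			fputc('*', out);
		fputc('\n', out);
		rows++;
	}
	return rows;
}
```

The kernel side only has to bump slots[log2(value)] in a map; all formatting
decisions stay in user space, which is what keeps the output "up to the user".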
