Re: [PATCH v3 perf, bpf-next 1/4] perf, bpf: Introduce PERF_RECORD_BPF_EVENT

From: Song Liu
Date: Wed Dec 12 2018 - 12:10:00 EST




> On Dec 12, 2018, at 5:15 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Tue, Dec 11, 2018 at 03:33:47PM -0800, Song Liu wrote:
>> For better performance analysis of BPF programs, this patch introduces
>> PERF_RECORD_BPF_EVENT, a new perf_event_type that exposes BPF program
>> load/unload information to user space.
>>
>> Each BPF program may contain up to BPF_MAX_SUBPROGS (256) sub programs.
>> The following example shows kernel symbols for a BPF program with 7
>> sub programs:
>>
>> ffffffffa0257cf9 t bpf_prog_b07ccb89267cf242_F
>> ffffffffa02592e1 t bpf_prog_2dcecc18072623fc_F
>> ffffffffa025b0e9 t bpf_prog_bb7a405ebaec5d5c_F
>> ffffffffa025dd2c t bpf_prog_a7540d4a39ec1fc7_F
>> ffffffffa025fcca t bpf_prog_05762d4ade0e3737_F
>> ffffffffa026108f t bpf_prog_db4bd11e35df90d4_F
>> ffffffffa0263f00 t bpf_prog_89d64e4abf0f0126_F
>> ffffffffa0257cf9 t bpf_prog_ae31629322c4b018__dummy_tracepoi
>
> Doesn't BPF have enough information to generate 'saner' names? Going by
> the thing below, these sub-progs are actually functions, right?

These sub programs/functions will have their descriptive names from BTF
function types (coming in 4.21). However, BTF is optional in normal cases,
when BTF is missing, they will be named as bpf_prog_<tag>_F. The main BPF
program has a name up to 16 byte long. In the example above, the last
program has name _dummy_tracepoint.

I think these sub programs are more like "programs" than "functions",
because each sub program occupies its own page(s).

>
>> /*
>> * Record different types of bpf events:
>> * enum perf_bpf_event_type {
>> * PERF_BPF_EVENT_UNKNOWN = 0,
>> * PERF_BPF_EVENT_PROG_LOAD = 1,
>> * PERF_BPF_EVENT_PROG_UNLOAD = 2,
>> * };
>> *
>> * struct {
>> * struct perf_event_header header;
>> * u32 type;
>> * u32 flags;
>> * u32 id; // prog_id or other id
>> * u32 sub_id; // subprog id
>> *
>> * // for bpf_prog types, bpf prog or subprog
>> * u8 tag[BPF_TAG_SIZE];
>> * u64 addr;
>> * u64 len;
>> * char name[];
>> * struct sample_id sample_id;
>> * };
>> */
>
> Isn't this mixing two different things (poorly)? The kallsym update and
> the BPF load/unload ?

I would say these two things are actually two parts of the same event.
Fields id, sub_id, and tag provide information about which program is
mapped to this ksym. They are equivalent to "pgoff + filename" in
PERF_RECORD_MMAP, or "maj, min, ino, and ino_generation" in
PERF_RECORD_MMAP2.

>
> And while this tracks the bpf kallsyms, it does not do all kallsyms.
>
> .... Oooh, I see the problem, everybody is doing their own custom
> kallsym_{add,del}() thing, instead of having that in generic code :-(
>
> This, for example, doesn't track module load/unload nor ftrace
> trampolines, even though both affect kallsyms.

I think we can use PERF_RECORD_MMAP(or MMAP2) for module load/unload.
That could be separate sets of patches.

Thanks,
Song

>
>> +void perf_event_bpf_event_prog(enum perf_bpf_event_type type,
>> + struct bpf_prog *prog)
>> +{
>> + if (!atomic_read(&nr_bpf_events))
>> + return;
>> +
>> + if (type != PERF_BPF_EVENT_PROG_LOAD &&
>> + type != PERF_BPF_EVENT_PROG_UNLOAD)
>> + return;
>> +
>> + if (prog->aux->func_cnt == 0) {
>> + perf_event_bpf_event_subprog(type, prog,
>> + prog->aux->id, 0);
>> + } else {
>> + int i;
>> +
>> + for (i = 0; i < prog->aux->func_cnt; i++)
>> + perf_event_bpf_event_subprog(type, prog->aux->func[i],
>> + prog->aux->id, i);
>> + }
>> +}
>
>