Re: [PATCH bpf-next 1/3] perf: enable branch record for software events

From: Peter Zijlstra
Date: Wed Aug 25 2021 - 08:10:30 EST


On Mon, Aug 23, 2021 at 11:01:55PM -0700, Song Liu wrote:

> arch/x86/events/intel/core.c | 5 ++++-
> arch/x86/events/intel/lbr.c | 12 ++++++++++++
> arch/x86/events/perf_event.h | 2 ++
> include/linux/perf_event.h | 33 +++++++++++++++++++++++++++++++++
> kernel/events/core.c | 28 ++++++++++++++++++++++++++++
> 5 files changed, 79 insertions(+), 1 deletion(-)

No PowerPC support :/

> +void intel_pmu_snapshot_branch_stack(void)
> +{
> +	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +
> +	intel_pmu_lbr_disable_all();
> +	intel_pmu_lbr_read();
> +	memcpy(this_cpu_ptr(&perf_branch_snapshot_entries), cpuc->lbr_entries,
> +	       sizeof(struct perf_branch_entry) * x86_pmu.lbr_nr);
> +	*this_cpu_ptr(&perf_branch_snapshot_size) = x86_pmu.lbr_nr;
> +	intel_pmu_lbr_enable_all(false);
> +}

This still has the layering violation and the issues vs. the PMI.

> +#ifdef CONFIG_HAVE_STATIC_CALL
> +DECLARE_STATIC_CALL(perf_snapshot_branch_stack,
> + perf_default_snapshot_branch_stack);
> +#else
> +extern void (*perf_snapshot_branch_stack)(void);
> +#endif

That's weird; static_call should work unconditionally. Without
CONFIG_HAVE_STATIC_CALL it falls back to a regular function pointer,
exactly like you do here. Search for "Generic Implementation" in
include/linux/static_call.h.

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 011cc5069b7ba..b42cc20451709 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c

> +#ifdef CONFIG_HAVE_STATIC_CALL
> +DEFINE_STATIC_CALL(perf_snapshot_branch_stack,
> + perf_default_snapshot_branch_stack);
> +#else
> +void (*perf_snapshot_branch_stack)(void) = perf_default_snapshot_branch_stack;
> +#endif

Same here.

Something like:

DEFINE_STATIC_CALL_NULL(perf_snapshot_branch_stack, void (*)(void));

with usage like:

static_call_cond(perf_snapshot_branch_stack)();

should work unconditionally.
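
To illustrate why the #ifdef is unnecessary, here is a rough userspace model
of what a NULL-initialised static call with a conditional call site behaves
like when it is backed by a plain function pointer. It is only a sketch: every
name in it (snapshot_fn_t, snapshot_update(), snapshot_cond_call(),
fake_pmu_snapshot(), demo()) is made up for illustration, and it does not use
or reproduce the kernel's static_call macros.

```c
#include <stddef.h>

/* Hypothetical userspace model; NOT the kernel's static_call machinery. */
typedef void (*snapshot_fn_t)(void);

/* NULL means "no PMU driver has registered a snapshot routine". */
static snapshot_fn_t snapshot_branch_stack;

/* Counts how often the registered routine ran; for demonstration only. */
static int snapshot_calls;

/* Stand-in for an arch-specific snapshot routine. */
static void fake_pmu_snapshot(void)
{
	snapshot_calls++;
}

/* Plays the role of static_call_update(): retarget the call. */
static void snapshot_update(snapshot_fn_t fn)
{
	snapshot_branch_stack = fn;
}

/* Plays the role of static_call_cond(name)(): a NULL target is a no-op. */
static void snapshot_cond_call(void)
{
	if (snapshot_branch_stack)
		snapshot_branch_stack();
}

/* Usage: returns how many times the snapshot routine actually ran. */
static int demo(void)
{
	snapshot_cond_call();			/* nothing registered: no-op */
	snapshot_update(fake_pmu_snapshot);
	snapshot_cond_call();			/* calls fake_pmu_snapshot() */
	snapshot_update(NULL);
	snapshot_cond_call();			/* unregistered again: no-op */
	return snapshot_calls;
}
```

The caller never needs to know whether a routine is registered, which is why
a separate default function and a config-dependent declaration buy nothing.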

> +int perf_read_branch_snapshot(void *buf, size_t len)
> +{
> +	int cnt;
> +
> +	memcpy(buf, *this_cpu_ptr(&perf_branch_snapshot_entries),
> +	       min_t(u32, (u32)len,
> +		     sizeof(struct perf_branch_entry) * MAX_BRANCH_SNAPSHOT));
> +	cnt = *this_cpu_ptr(&perf_branch_snapshot_size);
> +
> +	return (cnt > 0) ? cnt : -EOPNOTSUPP;
> +}

This doesn't seem to be used at all.