Re: [PATCH bpf-next v4 2/2] bpf: Use per-cpu BPF callchain entry to save callchain

From: Tao Chen

Date: Wed Oct 22 2025 - 12:09:37 EST


On 2025/10/22 01:28, Alexei Starovoitov wrote:
On Tue, Oct 21, 2025 at 9:07 AM Tao Chen <chen.dylane@xxxxxxxxx> wrote:

As Alexei noted, the entry returned by get_perf_callchain() may be
reused if the task is preempted after the BPF program enters
migrate-disable mode. Drawing on the per-cpu design of
bpf_bprintf_buffers, per-cpu BPF callchain entries are used here.

Signed-off-by: Tao Chen <chen.dylane@xxxxxxxxx>
---
kernel/bpf/stackmap.c | 98 ++++++++++++++++++++++++++++++++-----------
1 file changed, 73 insertions(+), 25 deletions(-)

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 94e46b7f340..97028d39df1 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -31,6 +31,52 @@ struct bpf_stack_map {
struct stack_map_bucket *buckets[] __counted_by(n_buckets);
};

+struct bpf_perf_callchain_entry {
+ u64 nr;
+ u64 ip[PERF_MAX_STACK_DEPTH];
+};
+
+#define MAX_PERF_CALLCHAIN_PREEMPT 3
+static DEFINE_PER_CPU(struct bpf_perf_callchain_entry[MAX_PERF_CALLCHAIN_PREEMPT],
+ bpf_perf_callchain_entries);
+static DEFINE_PER_CPU(int, bpf_perf_callchain_preempt_cnt);

This is too much extra memory. Above adds 1k * 3 * num_cpus.
Let's reuse perf callchains.
Especially since they're controlled by perf_event_max_stack sysctl.
See Peter's suggestion in v3.
And for the future don't respin so quickly.

Ok, let's base our discussion on v3. Sorry that the overly rapid version iterations impacted the maintainers' review process; I will respin more slowly, thanks.

--
Best Regards
Tao Chen