[PATCH v2 0/2] Optimize performance of update hash-map when free is zero

From: Feng Zhou
Date: Tue May 24 2022 - 03:53:29 EST


From: Feng Zhou <zhoufeng.zf@xxxxxxxxxxxxx>

We encountered a bad case on a large system with 96 CPUs where
alloc_htab_elem() would take as long as 1ms. The reason is that once the
preallocated hashtab has no free elems, an update still tries to grab the
spin_locks of all CPUs in turn. If there are multiple concurrent updaters,
the lock contention is severe.

0001: Add is_empty to check whether the free list is empty before taking
the lock.
0002: Add a benchmark to reproduce this worst case.
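
For illustration only, the idea behind 0001 can be sketched in userspace C.
This is not the kernel patch: it swaps pthread mutexes and C11 atomics in for
the kernel's spinlocks and per-CPU data, and every name in it (sketch_freelist,
sketch_pop, pop_locks_taken, SKETCH_NR_CPUS) is hypothetical. It assumes that
an unlocked emptiness check is acceptable because a stale result only causes a
missed element or one extra lock acquisition, never a corrupted list.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4	/* hypothetical; the report above used 96 CPUs */

struct sketch_node {
	struct sketch_node *next;
};

struct sketch_cpu_list {
	pthread_mutex_t lock;			/* stands in for a per-CPU spinlock */
	_Atomic(struct sketch_node *) first;	/* head of this CPU's free list */
};

struct sketch_freelist {
	struct sketch_cpu_list cpu[SKETCH_NR_CPUS];
	atomic_ulong pop_locks_taken;		/* instrumentation for this sketch only */
};

static void sketch_init(struct sketch_freelist *fl)
{
	for (int i = 0; i < SKETCH_NR_CPUS; i++) {
		pthread_mutex_init(&fl->cpu[i].lock, NULL);
		atomic_init(&fl->cpu[i].first, NULL);
	}
	atomic_init(&fl->pop_locks_taken, 0);
}

static void sketch_push(struct sketch_freelist *fl, int cpu,
			struct sketch_node *n)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	struct sketch_cpu_list *l = &fl->cpu[cpu];

	pthread_mutex_lock(&l->lock);
	n->next = atomic_load_explicit(&l->first, memory_order_relaxed);
	atomic_store_explicit(&l->first, n, memory_order_release);
	pthread_mutex_unlock(&l->lock);
}

/* Walk all per-CPU lists starting at start_cpu, skipping empty ones
 * without touching their locks. */
static struct sketch_node *sketch_pop(struct sketch_freelist *fl, int start_cpu)
{
	for (int i = 0; i < SKETCH_NR_CPUS; i++) {
		struct sketch_cpu_list *l =
			&fl->cpu[(start_cpu + i) % SKETCH_NR_CPUS];
		struct sketch_node *n;

		/* Unlocked peek: if the list looks empty, do not take the lock. */
		if (!atomic_load_explicit(&l->first, memory_order_relaxed))
			continue;

		pthread_mutex_lock(&l->lock);
		atomic_fetch_add(&fl->pop_locks_taken, 1);
		/* Re-check under the lock; another thread may have emptied it. */
		n = atomic_load_explicit(&l->first, memory_order_relaxed);
		if (n)
			atomic_store_explicit(&l->first, n->next,
					      memory_order_relaxed);
		pthread_mutex_unlock(&l->lock);

		if (n)
			return n;
	}
	return NULL;	/* every list was empty or was drained concurrently */
}
```

In this sketch, calling sketch_pop() on a fully drained freelist returns NULL
without acquiring any lock, which is the case the cover letter describes as
costing up to 1ms when every CPU's lock is taken in turn.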

Changelog:
v1->v2: Addressed comments from Alexei Starovoitov.
- Add a benchmark to reproduce the issue.
- Adjust the code format to avoid adding indentation.
Some details here:
https://lore.kernel.org/all/877ac441-045b-1844-6938-fcaee5eee7f2@xxxxxxxxxxxxx/T/

Feng Zhou (2):
  bpf: avoid grabbing spin_locks of all cpus when no free elems
  selftest/bpf/benchs: Add bpf_map benchmark

 kernel/bpf/percpu_freelist.c                  | 28 ++++++-
 kernel/bpf/percpu_freelist.h                  |  1 +
 tools/testing/selftests/bpf/Makefile          |  4 +-
 tools/testing/selftests/bpf/bench.c           |  2 +
 .../selftests/bpf/benchs/bench_bpf_map.c      | 78 +++++++++++++++++++
 .../selftests/bpf/benchs/run_bench_bpf_map.sh | 10 +++
 .../selftests/bpf/progs/bpf_map_bench.c       | 27 +++++++
 7 files changed, 146 insertions(+), 4 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_bpf_map.c
 create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_bpf_map.sh
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_bench.c

--
2.20.1