[PATCH v1 0/5] perf bench: Add rcu to the 'bench sync' collection

From: Yuzhuo Jing
Date: Thu Jul 31 2025 - 09:27:05 EST


Add a 'bench sync rcu' benchmark, using the kernel's rcuscale module.

This patch series adds the following features:
* Automatic rcuscale module load/unload and grace-period statistics.
  (The statistics feature was derived from
  tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuscale.sh.)
  (patch 1)
* A simple benchmark mode that takes a list of parameters supported by
  rcuscale. (patch 1)
* A feature to execute a child process and automatically replace
  reader/writer thread ID placeholder strings. This allows the child
  process to attach to the kernel threads and collect performance
  statistics. (patch 2)
* A range-based benchmark that enumerates all combinations of parameter
  ranges. (patch 3)
* A ratio-based benchmark that scales between two parameters. (patch 4)
Example usages have been added to each patch commit message.
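
As a rough illustration of the placeholder replacement in patch 2, the
sketch below expands thread-ID placeholders in a child command line. It
is not the C implementation from the patch: the token strings
{READERS} and {WRITERS}, the function name, and the comma-separated
output format are all assumptions made for this example.

```python
# Hypothetical placeholder tokens. The actual strings used by the patch
# are not given in this cover letter.
READER_TOKEN = "{READERS}"
WRITER_TOKEN = "{WRITERS}"


def expand_placeholders(argv, reader_tids, writer_tids):
    """Return a copy of argv with placeholder tokens replaced.

    Each token is replaced by a comma-separated list of thread IDs
    (an assumed format, chosen because tools that attach to threads
    commonly accept such a list).
    """
    readers = ",".join(str(tid) for tid in reader_tids)
    writers = ",".join(str(tid) for tid in writer_tids)
    return [
        arg.replace(READER_TOKEN, readers).replace(WRITER_TOKEN, writers)
        for arg in argv
    ]
```

Under these assumptions, a child command such as
["perf", "stat", "-t", "{READERS}"] with reader thread IDs 101 and 102
would expand to ["perf", "stat", "-t", "101,102"] before being executed.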

This patch series depends on the new features of an ongoing patch series
that exposes rcuscale module internal states and experiment results
through debugfs. That patch series is also required for programmatic
experiment start/finish controls.
Link: https://lore.kernel.org/lkml/20250730022347.71722-1-yuzhuo@xxxxxxxxxx/T/

RFCs:
* This patch series depends on the behavior of the rcuscale kernel
  module. If its interface changes, especially the format of the
  aforementioned "experiment results", this benchmark may break.
* The tools/testing/selftests/rcutorture suite provides a set of
  scripts to run rcuscale, rcutorture, and refscale in KVM, but leaves
  out bare-metal testing. This patch series provides direct
  benchmarking without KVM indirection. However, the two suites reside
  in different directories. Is there a better way to integrate them?
* (Patch 3) What would be a better range format? The current format
  is defined as start[:end:step] and supports only integers. We may
  eventually want ranges for non-integers, or relationships defined
  by expressions.
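
To make the range question concrete, here is a minimal sketch of how
the start[:end:step] format and the "all combinations" enumeration
could behave. It is an illustration only, not the code in sync-rcu.c.
The function names are hypothetical, and two semantics are assumed
rather than taken from the patch: the end value is inclusive, and the
step must be positive.

```python
import itertools


def parse_range(spec):
    """Parse 'start' or 'start:end:step' into a list of integers.

    Assumptions for this sketch: 'end' is inclusive and 'step' is a
    positive integer.
    """
    parts = spec.split(":")
    if len(parts) == 1:
        return [int(parts[0])]
    if len(parts) != 3:
        raise ValueError(f"expected start[:end:step], got {spec!r}")
    start, end, step = (int(p) for p in parts)
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(start, end + 1, step))


def enumerate_combinations(params):
    """Return every combination of parameter values as a list of dicts.

    'params' maps a parameter name to its range specification string.
    """
    names = list(params)
    value_lists = [parse_range(params[name]) for name in names]
    return [dict(zip(names, combo))
            for combo in itertools.product(*value_lists)]
```

For example, with the rcuscale parameters nreaders="1:3:1" and
nwriters="2", this sketch yields three runs, one for each nreaders
value from 1 to 3 with nwriters fixed at 2.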

The patches are based on an ongoing series. Specifically, the minor
changes in builtin-bench.c may prevent this series from applying
cleanly to master/HEAD, although sync-rcu.c itself is independent of
the lock benchmarks from the previous series.
Link: https://lore.kernel.org/lkml/20250729022640.3134066-1-yuzhuo@xxxxxxxxxx/T/
Link: https://lore.kernel.org/lkml/20250729081256.3433892-1-yuzhuo@xxxxxxxxxx/T/

Yuzhuo Jing (5):
perf bench: Add RCU benchmark using rcuscale kernel module
perf bench: Implement subprocess execution for 'sync rcu'
perf bench: Add 'range' mode to 'sync rcu'
perf bench: Add 'ratio' mode to 'sync rcu'
perf bench: Add documentation for 'sync rcu' suite

tools/perf/Documentation/perf-bench.txt | 131 +++
tools/perf/bench/Build | 1 +
tools/perf/bench/bench.h | 1 +
tools/perf/bench/sync-rcu.c | 1319 +++++++++++++++++++++++
tools/perf/builtin-bench.c | 1 +
5 files changed, 1453 insertions(+)
create mode 100644 tools/perf/bench/sync-rcu.c

--
2.50.1.565.gc32cd1483b-goog