Re: [PATCH 08/14] perf test: Add memcpy thread test shell script

From: Carsten Haitzler
Date: Fri Jul 08 2022 - 05:19:11 EST




On 7/5/22 15:25, James Clark wrote:


On 01/07/2022 13:07, carsten.haitzler@xxxxxxxxxxxx wrote:
From: "Carsten Haitzler (Rasterman)" <raster@xxxxxxxxxxxxx>

Add a script to drive the threaded memcpy test that gathers data so
it passes a minimum bar for amount and quality of content that we
extract from the kernel's perf support.


On this one I get a failure about 1/50 times on N1SDP (I ran it about 150

I also see inconsistent results. The whole point of these tests is to point this out, provide data to track it and eventually lead to improvements/fixes. A failing test is probably good - it found a problem. perf test has lots of failures for me, so I'm taking the position that failures in perf test are normally OK as long as you know what those failures are and why they happen.

times and saw 3 failures so it's quite consistent). Usually it records
about a 1.4MB file with one aux record. But when it fails the file is
only 20K and has one small aux record:

0 0 0x1a10 [0x30]: PERF_RECORD_AUXTRACE size: 0x1820 offset: 0 ref: 0x1c23126d7ff3d2ab idx: 3 tid: 682799 cpu: 3

Nothing was dropped, and the load on the system wasn't any different
to when it passes. So I'm not sure if this is a real coresight bug
or that the test is flaky. There was a bug in SPE before where

The binary is the same with the same content, running the same perf command every time. The workload doesn't change, but the perf data captured does. It sometimes captures so little that it fails even the low pass bar set in the test.

threads weren't followed after forking, but only very rarely. It feels
a bit like that.

That ... would be a "CoreSight" bug though I think, not the test.

It could also be some contention issue because 10 threads are launched
but the machine only has 4 cores.

We should still be capturing data reliably (in theory). If you have 10 threads on a 4 core machine, the same workload will take longer to run as the threads have to share cores, but this should still result in decent data collection as the cores switch between threads. That's the point.

The failure message from the test looks like this:

77: CoreSight / Memcpy 16k 10 Threads :
--- start ---
Couldn't synthesize bpf events.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB ./perf-memcpy_thread-16k_10.data ]
Sanity check number of ASYNC is too low (3 < 10)
---- end ----
CoreSight / Memcpy 16k 10 Threads: FAILED!
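
For anyone reading along, the "too low (3 < 10)" line is a minimum-count check on the decoded trace dump. A rough sketch of that kind of check follows. It assumes the dump is a plain text file in which each packet shows up by name. The function name check_min_packets and its arguments are hypothetical, and this is not the actual perf_dump_aux_verify helper that the script calls:

```shell
#!/bin/sh
# Hypothetical helper: count how often a packet name occurs in a text dump
# and fail if the count is below a minimum.
# Usage: check_min_packets <dump-file> <packet-name> <minimum>
check_min_packets() {
	dump="$1"
	name="$2"
	min="$3"
	# grep -o prints every match on its own line, so wc -l counts matches
	# rather than matching lines. tr strips the padding some wc versions add.
	num=$(grep -o -e "$name" "$dump" | wc -l | tr -d '[:space:]')
	if [ "$num" -lt "$min" ]; then
		echo "Sanity check number of $name is too low ($num < $min)"
		return 1
	fi
	return 0
}
```

Called as `check_min_packets dump.txt ASYNC 10` on a dump with only 3 such packets, this would print a message like the one above and return non-zero. The exact packet names in a real dump may differ.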

I didn't see this issue on any of the other tests. Sometimes very small
files were made if I loaded the system, but the tests still passed.

For me the "Check TID" test fails very often... but as I said - the point here is to find issues and ensure they are reported in the results. The tests even track the results over time/many runs in the csv files, so you get a good idea of consistency and of how it may statistically change over time, which you can match up to changes in the kernel.

Unless of course you think it's acceptable that sometimes perf record + CoreSight will output essentially no data (your 20k example). :)
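
To put a number on a failure rate like the 3 in 150 runs mentioned above, something along these lines could be used. This is only a sketch: the function name count_failures and the "run,result" CSV layout are made up for illustration and are not the format the test suite writes. It works with any command that exits non-zero on failure:

```shell
#!/bin/sh
# Hypothetical helper: run a command N times, append one "run,result" row
# per run to a CSV file, and print the number of failed runs.
# Usage: count_failures <runs> <csv-file> <command> [args...]
count_failures() {
	runs="$1"
	csv="$2"
	shift 2
	fails=0
	i=1
	while [ "$i" -le "$runs" ]; do
		# Discard the command's output; only its exit status matters here
		if "$@" > /dev/null 2>&1; then
			echo "$i,pass" >> "$csv"
		else
			echo "$i,fail" >> "$csv"
			fails=$((fails + 1))
		fi
		i=$((i + 1))
	done
	echo "$fails"
}
```

For example, `count_failures 150 results.csv <test command>` prints the number of failed runs and leaves a per-run record in results.csv that can be compared across kernel versions.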

Thanks
James

Signed-off-by: Carsten Haitzler <carsten.haitzler@xxxxxxx>
---
.../shell/coresight/memcpy_thread_16k_10.sh | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
create mode 100755 tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh

diff --git a/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
new file mode 100755
index 000000000000..d21ba8545938
--- /dev/null
+++ b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
@@ -0,0 +1,18 @@
+#!/bin/sh -e
+# CoreSight / Memcpy 16k 10 Threads
+
+# SPDX-License-Identifier: GPL-2.0
+# Carsten Haitzler <carsten.haitzler@xxxxxxx>, 2021
+
+TEST="memcpy_thread"
+. $(dirname $0)/../lib/coresight.sh
+ARGS="16 10 1"
+DATV="16k_10"
+DATA="$DATD/perf-$TEST-$DATV.data"
+
+perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
+
+perf_dump_aux_verify "$DATA" 10 10 10
+
+err=$?
+exit $err