Re: [PATCH 1/2] perf evlist: Keep topdown counters in weak group

From: Liang, Kan
Date: Thu May 05 2022 - 14:18:40 EST




On 5/5/2022 11:18 AM, Ian Rogers wrote:
On Thu, May 5, 2022 at 4:56 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:

Hi Ian,

Thanks for the patches.

On 5/5/2022 12:38 AM, Ian Rogers wrote:
On Intel Icelake, topdown events must always be grouped with a slots
event as leader. When a metric is parsed a weak group is formed and
retried if perf_event_open fails. The retried events aren't grouped
breaking the slots leader requirement. This change modifies the weak
group "reset" behavior so that topdown events aren't broken from the
group for the retry.

$ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1

Performance counter stats for 'system wide':

47,867,188,483 slots (92.27%)
<not supported> topdown-bad-spec
<not supported> topdown-be-bound
<not supported> topdown-fe-bound
<not supported> topdown-retiring
2,173,346,937 branch-instructions (92.27%)
10,540,253 branch-misses # 0.48% of all branches (92.29%)
96,291,140 bus-cycles (92.29%)
6,214,202 cache-misses # 20.120 % of all cache refs (92.29%)
30,886,082 cache-references (76.91%)
11,773,726,641 cpu-cycles (84.62%)
11,807,585,307 instructions # 1.00 insn per cycle (92.31%)
0 mem-loads (92.32%)
2,212,928,573 mem-stores (84.69%)
10,024,403,118 ref-cycles (92.35%)
16,232,978 baclears.any (92.35%)
23,832,633 ARITH.DIVIDER_ACTIVE (84.59%)

0.981070734 seconds time elapsed

After:

$ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1

Performance counter stats for 'system wide':

31040189283 slots (92.27%)
8997514811 topdown-bad-spec # 28.2% bad speculation (92.27%)
10997536028 topdown-be-bound # 34.5% backend bound (92.27%)
4778060526 topdown-fe-bound # 15.0% frontend bound (92.27%)
7086628768 topdown-retiring # 22.2% retiring (92.27%)
1417611942 branch-instructions (92.26%)
5285529 branch-misses # 0.37% of all branches (92.28%)
62922469 bus-cycles (92.29%)
1440708 cache-misses # 8.292 % of all cache refs (92.30%)
17374098 cache-references (76.94%)
8040889520 cpu-cycles (84.63%)
7709992319 instructions # 0.96 insn per cycle (92.32%)
0 mem-loads (92.32%)
1515669558 mem-stores (84.68%)
6542411177 ref-cycles (92.35%)
4154149 baclears.any (92.35%)
20556152 ARITH.DIVIDER_ACTIVE (84.59%)

1.010799593 seconds time elapsed

Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
---
tools/perf/arch/x86/util/evsel.c | 12 ++++++++++++
tools/perf/util/evlist.c | 16 ++++++++++++++--
tools/perf/util/evsel.c | 10 ++++++++++
tools/perf/util/evsel.h | 3 +++
4 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index ac2899a25b7a..40b171de2086 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -3,6 +3,7 @@
#include <stdlib.h>
#include "util/evsel.h"
#include "util/env.h"
+#include "util/pmu.h"
#include "linux/string.h"

void arch_evsel__set_sample_weight(struct evsel *evsel)
@@ -29,3 +30,14 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)

free(env.cpuid);
}
+
+bool arch_evsel__must_be_in_group(const struct evsel *evsel)
+{
+ if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
+ !pmu_have_event("cpu", "slots"))
+ return false;
+

The big core of ADL also supports Perf_metrics. At least, we should
check both "cpu" and "cpu_core" here.

Thanks,
Kan

Thanks Kan, we have the same problem for
arch_evlist__add_default_attrs and arch_evlist__leader:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/arch/x86/util/evlist.c?h=perf/core

These are known issues. Zhengjun already have some patches to fix them, but hasn't posted yet.

The arch_evlist__add_default_attrs impacts the perf stat default. On the ADL, perf stat default doesn't display the topdown events.

The arch_evlist__leader impacts the sampling read of the topdown events. Sampling read the topdown events should always fail with the current perf tool on ADL.

We already have many backlogs on ADL. I think it's better not to introduce more issues.

So I think fixing all of these should be a follow up. I am working to
get access to an Alderlake system, could we land this first?


I think we can use pmu_name to replace the "cpu" to fix the issue for the hybrid platform. For a hybrid platform, the pmu_name is either cpu_atom or cpu_core.

Besides, the topdown events may have a PMU prefix, e.g., cpu_core/topdown-be-bound/. The strcasecmp may not work well for this case.

How about the below patch?
If it's OK for you, could you please merge it into your V2 patch set?
I can do the test on a ADL system.

diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index 40b171de2086..551ae2bab70e 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -33,11 +33,12 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)

bool arch_evsel__must_be_in_group(const struct evsel *evsel)
{
- if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
- !pmu_have_event("cpu", "slots"))
+ const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
+
+ if (!pmu_have_event(pmu_name, "slots"))
return false;

return evsel->name &&
- (!strcasecmp(evsel->name, "slots") ||
- !strncasecmp(evsel->name, "topdown", 7));
+ (strcasestr(evsel->name, "slots") ||
+ strcasestr(evsel->name, "topdown"));
}



Thanks,
Kan

Thanks,
Ian

+ return evsel->name &&
+ (!strcasecmp(evsel->name, "slots") ||
+ !strncasecmp(evsel->name, "topdown", 7));
+}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 52ea004ba01e..dfa65a383502 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1790,8 +1790,17 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
if (evsel__has_leader(c2, leader)) {
if (is_open && close)
perf_evsel__close(&c2->core);
- evsel__set_leader(c2, c2);
- c2->core.nr_members = 0;
+ /*
+ * We want to close all members of the group and reopen
+ * them. Some events, like Intel topdown, require being
+ * in a group and so keep these in the group.
+ */
+ if (!evsel__must_be_in_group(c2) && c2 != leader) {
+ evsel__set_leader(c2, c2);
+ c2->core.nr_members = 0;
+ leader->core.nr_members--;
+ }
+
/*
* Set this for all former members of the group
* to indicate they get reopened.
@@ -1799,6 +1808,9 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
c2->reset_group = true;
}
}
+ /* Reset the leader count if all entries were removed. */
+ if (leader->core.nr_members)
+ leader->core.nr_members = 0;
return leader;
}

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d38722560e80..b7c0c9775673 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -3082,3 +3082,13 @@ int evsel__source_count(const struct evsel *evsel)
}
return count;
}
+
+bool __weak arch_evsel__must_be_in_group(const struct evsel *evsel __maybe_unused)
+{
+ return false;
+}
+
+bool evsel__must_be_in_group(const struct evsel *evsel)
+{
+ return arch_evsel__must_be_in_group(evsel);
+}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 45d674812239..0ed2850b7ebb 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -488,6 +488,9 @@ bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
bool evsel__is_leader(struct evsel *evsel);
void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
int evsel__source_count(const struct evsel *evsel);
+bool evsel__must_be_in_group(const struct evsel *evsel);
+
+bool arch_evsel__must_be_in_group(const struct evsel *evsel);

/*
* Macro to swap the bit-field postition and size.