Re: [PATCH] perf metricgroup: Fix for metrics containing duration_time

From: John Garry
Date: Wed Jan 20 2021 - 12:15:03 EST


On 20/01/2021 16:40, Ian Rogers wrote:
On Wed, Jan 20, 2021 at 8:23 AM John Garry <john.garry@xxxxxxxxxx> wrote:

Metrics containing duration_time cause a segfault:

$./perf stat -v -M L1D_Cache_Fill_BW sleep 1
Using CPUID GenuineIntel-6-3D-4
metric expr 64 * l1d.replacement / 1000000000 / duration_time for L1D_Cache_Fill_BW
found event duration_time
found event l1d.replacement
adding {l1d.replacement}:W,duration_time
l1d.replacement -> cpu/umask=0x1,(null)=0x1e8483,event=0x51/
Segmentation fault

In commit c2337d67199a ("perf metricgroup: Fix metrics using aliases
covering multiple PMUs"), the logic in find_evsel_group() when iter'ing
events was changed to not only select events in same group, but also for
aliased PMUs.

Checking whether events were for aliased PMUs was done by comparing the
event PMU name. This was not safe for the duration_time event, which has
no associated PMU (and no PMU name), so fix by also checking that the
event PMU name is set.
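
For illustration only, the NULL-safe comparison described above can be tried outside of perf with a minimal sketch. The names evsel_sketch and same_pmu_sketch are hypothetical stand-ins, not perf identifiers; the real helper operates on struct evsel, as the diff further down shows.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct evsel; only the field used here. */
struct evsel_sketch {
	const char *pmu_name;	/* NULL for events with no PMU, e.g. duration_time */
};

/*
 * Same shape as the helper in the patch: an event without a PMU name
 * never compares equal, so strcmp() is never handed a NULL pointer.
 */
static bool same_pmu_sketch(const struct evsel_sketch *ev1,
			    const struct evsel_sketch *ev2)
{
	if (!ev1->pmu_name || !ev2->pmu_name)
		return false;

	return !strcmp(ev1->pmu_name, ev2->pmu_name);
}
```

With this shape, two events that both lack a PMU name are also reported as not sharing a PMU, which is the conservative answer for the grouping check.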


Thanks for this, it should be fairly easy to add a test. Could we do this?

I don't mind following up with that.


Fixes: c2337d67199a ("perf metricgroup: Fix metrics using aliases covering multiple PMUs")
Reported-by: Joakim Zhang <qiangqing.zhang@xxxxxxx>
Signed-off-by: John Garry <john.garry@xxxxxxxxxx>

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 2e60ee170abc..e6d3452031e5 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -162,6 +162,14 @@ static bool contains_event(struct evsel **metric_events, int num_events,
        return false;
 }

+static bool evsel_same_pmu(struct evsel *ev1, struct evsel *ev2)
+{
+       if (!ev1->pmu_name || !ev2->pmu_name)
+               return false;


What about the case of "!ev1->pmu_name && !ev2->pmu_name" ?

As far as I know, it should not happen, since duration_time is a special event. More below.


Thanks,
Ian

+
+       return !strcmp(ev1->pmu_name, ev2->pmu_name);
+}
+
 /**
  * Find a group of events in perf_evlist that correspond to those from a parsed
  * metric expression. Note, as find_evsel_group is called in the same order as
@@ -280,8 +288,7 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
                         */
                        if (!has_constraint &&
                            ev->leader != metric_events[i]->leader &&
-                           !strcmp(ev->leader->pmu_name,
-                                   metric_events[i]->leader->pmu_name))
+                           evsel_same_pmu(ev->leader, metric_events[i]->leader))

ev->leader->pmu_name == NULL only for the duration_time event. And we never get here with ev == metric_events[i] == the duration_time event (we use evlist__for_each_entry_continue(), and duration_time is always last in metric_events[]), so the two event arguments should never both have pmu_name == NULL. Indeed, I could just check metric_events[i]->leader->pmu_name != NULL, but thought it better to check both for safety.

Cheers,
John

                                break;
                        if (!strcmp(metric_events[i]->name, ev->name)) {
                                set_bit(ev->idx, evlist_used);
--
2.26.2