Re: [PATCH V2 3/5] perf/x86/intel: Filter unsupported Topdown metrics event

From: Liang, Kan
Date: Wed Jan 27 2021 - 15:52:35 EST

On 1/27/2021 2:13 PM, Peter Zijlstra wrote:
> On Wed, Jan 27, 2021 at 07:38:43AM -0800, kan.liang@xxxxxxxxxxxxxxx wrote:
>> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>
>> Current perf doesn't check the index of a Topdown metrics event before
>> updating the event. A perf tool user may get a value from an unsupported
>> Topdown metrics event.
>>
>> For example, the L2 topdown event, cpu/event=0x00,umask=0x84/, is not
>> supported on Ice Lake. A perf tool user may mistakenly use the
>> unsupported events via RAW format. In this case, the scheduler follows
>> the unknown event handling and assigns a GP counter to the event. The
>> icl_get_topdown_value() returns the value of the slots to the event.
>> The perf tool user will get the value for the unsupported
>> Topdown metrics event.
>>
>> Add a check in the __icl_update_topdown_event() and avoid updating
>> unsupported events.
>
> I was struggling trying to understand how we end up here. Because
> userspace can add whatever crap it wants, and why is only this thing a
> problem..
>
> But the actual problem is the next patch changing INTEL_TD_METRIC_NUM,
> which then means is_metric_idx() will change, and that means that for
> ICL we'll accept these raw events as metric events on creation and then
> at runtime we get into trouble.
>
> This isn't spelled out.
>
> I do think this is entirely the wrong fix for that though. You're now
> adding cycles to the relative hot path, instead of fixing the problem at
> event creation, which is the slow path.
>
> Why can't we either refuse the event on ICL or otherwise wreck it on
> construction to avoid getting into trouble here?


To reject the unsupported Topdown events at event creation, perf needs
to know the number of Topdown metrics events the CPU supports. Maybe we
can add a num_topdown_events field to struct x86_pmu as below. With
num_topdown_events = 4 on Ice Lake, INTEL_TD_METRIC_AVAILABLE_MAX
evaluates to 0x8000 + (3 << 8) = 0x8300, so the four L1 metric
encodings (0x8000-0x8300) still match is_metric_event() while the L2
encodings (0x8400 and above) no longer do. Is it OK?
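
If I understand it correctly, this moves the filtering to the slow
path as Peter suggested: the unsupported L2 event from the changelog,
for example

  perf stat -e cpu/event=0x00,umask=0x84/ -a sleep 1

should no longer match is_metric_event() on Ice Lake, so it is never
configured as a metric event and cannot reach
__icl_update_topdown_event() at runtime.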

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 3d6fdcf..c7f2602 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1061,6 +1061,18 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 	return unsched ? -EINVAL : 0;
 }
 
+#define INTEL_TD_METRIC_AVAILABLE_MAX	(INTEL_TD_METRIC_RETIRING + \
+					 ((x86_pmu.num_topdown_events - 1) << 8))
+
+inline bool is_metric_event(struct perf_event *event)
+{
+	u64 config = event->attr.config;
+
+	return ((config & ARCH_PERFMON_EVENTSEL_EVENT) == 0) &&
+		((config & INTEL_ARCH_EVENT_MASK) >= INTEL_TD_METRIC_RETIRING) &&
+		((config & INTEL_ARCH_EVENT_MASK) <= INTEL_TD_METRIC_AVAILABLE_MAX);
+}
+
 static int add_nr_metric_event(struct cpu_hw_events *cpuc,
 			       struct perf_event *event)
 {
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index cd4d542..eab1eba 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5769,6 +5769,7 @@ __init int intel_pmu_init(void)
 		x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xc9, .umask=0x04);
 		x86_pmu.lbr_pt_coexist = true;
 		intel_pmu_pebs_data_source_skl(pmem);
+		x86_pmu.num_topdown_events = 4;
 		x86_pmu.update_topdown_event = icl_update_topdown_event;
 		x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
 		pr_cont("Icelake events, ");
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 15977d0..8b05893 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -87,14 +87,7 @@ static inline bool is_topdown_count(struct perf_event *event)
 	return event->hw.flags & PERF_X86_EVENT_TOPDOWN;
 }
 
-static inline bool is_metric_event(struct perf_event *event)
-{
-	u64 config = event->attr.config;
-
-	return ((config & ARCH_PERFMON_EVENTSEL_EVENT) == 0) &&
-		((config & INTEL_ARCH_EVENT_MASK) >= INTEL_TD_METRIC_RETIRING) &&
-		((config & INTEL_ARCH_EVENT_MASK) <= INTEL_TD_METRIC_MAX);
-}
+inline bool is_metric_event(struct perf_event *event);
 
 static inline bool is_slots_event(struct perf_event *event)
 {
@@ -782,6 +775,7 @@ struct x86_pmu {
 	/*
 	 * Intel perf metrics
 	 */
+	int		num_topdown_events;
 	u64		(*update_topdown_event)(struct perf_event *event);
 	int		(*set_topdown_event_period)(struct perf_event *event);
 

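For reference, the creation-time check is just range arithmetic on the
event config. Here is a quick user-space sketch to double-check which
encodings pass on Ice Lake (not kernel code; the constant values are
copied from arch/x86/include/asm/perf_event.h, and the standalone
num_topdown_events variable stands in for the new x86_pmu field):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define ARCH_PERFMON_EVENTSEL_EVENT	0x000000FFULL
#define INTEL_ARCH_EVENT_MASK		0xFFFFULL	/* umask | event select */
#define INTEL_TD_METRIC_RETIRING	0x8000ULL	/* event=0x00, umask=0x80 */

static int num_topdown_events = 4;	/* Ice Lake: four L1 metrics */

static bool is_metric_event(uint64_t config)
{
	/* Highest metric encoding this CPU supports: 0x8000 + 3 * 0x100. */
	uint64_t max = INTEL_TD_METRIC_RETIRING +
		       ((uint64_t)(num_topdown_events - 1) << 8);

	return ((config & ARCH_PERFMON_EVENTSEL_EVENT) == 0) &&
	       ((config & INTEL_ARCH_EVENT_MASK) >= INTEL_TD_METRIC_RETIRING) &&
	       ((config & INTEL_ARCH_EVENT_MASK) <= max);
}

int main(void)
{
	/* L1 metrics 0x8000-0x8300 pass; L2 metrics 0x8400+ are filtered. */
	printf("0x8300 (L1 be_bound):  %d\n", is_metric_event(0x8300)); /* 1 */
	printf("0x8400 (L2, umask=0x84): %d\n", is_metric_event(0x8400)); /* 0 */
	return 0;
}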

Thanks,
Kan