Re: [PATCH V2 3/5] perf/x86/intel: Filter unsupported Topdown metrics event

From: Peter Zijlstra
Date: Wed Jan 27 2021 - 14:14:38 EST


On Wed, Jan 27, 2021 at 07:38:43AM -0800, kan.liang@xxxxxxxxxxxxxxx wrote:
> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>
> Current perf doesn't check the index of a Topdown metrics event before
> updating the event. A perf tool user may get a value from an unsupported
> Topdown metrics event.
>
> For example, the L2 topdown event, cpu/event=0x00,umask=0x84/, is not
> supported on Ice Lake. A perf tool user may mistakenly use the
> unsupported events via RAW format. In this case, the scheduler follows
> the unknown event handling and assigns a GP counter to the event. The
> icl_get_topdown_value() returns the value of the slots to the event.
> The perf tool user will get the value for the unsupported
> Topdown metrics event.
>
> Add a check in the __icl_update_topdown_event() and avoid updating
> unsupported events.

I was struggling trying to understand how we end up here. Because
userspace can add whatever crap it wants, and why is only this thing a
problem..

But the actual problem is the next patch changing INTEL_TD_METRIC_NUM,
which then means is_metric_idx() will change, and that means that for
ICL we'll accept these raw events as metric events on creation and then
at runtime we get into trouble.

This isn't spelled out.

I do think this is entirely the wrong fix for that though. You're now
adding cycles to the relative hot path, instead of fixing the problem at
event creation, which is the slow path.

Why can't we either refuse the event on ICL or otherwise wreck it on
construction to avoid getting into trouble here?

> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> ---
> arch/x86/events/intel/core.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 8eba41b..b02d482 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2319,6 +2319,17 @@ static void __icl_update_topdown_event(struct perf_event *event,
> {
> u64 delta, last = 0;
>
> + /*
> + * Although the unsupported topdown events are not exposed to users,
> + * users may mistakenly use the unsupported events via RAW format.
> + * For example, using L2 topdown event, cpu/event=0x00,umask=0x84/,
> + * on Ice Lake. In this case, the scheduler follows the unknown
> + * event handling and assigns a GP counter to the event.
> + * Check the case, and avoid updating unsupported events.
> + */
> + if (event->hw.idx < INTEL_PMC_IDX_FIXED)
> + return;
> +
> delta = icl_get_topdown_value(event, slots, metrics);
> if (last_slots)
> last = icl_get_topdown_value(event, last_slots, last_metrics);
> --
> 2.7.4
>