[tip: perf/urgent] perf: Consider OS filter fail

From: tip-bot2 for Peter Zijlstra
Date: Thu Nov 24 2022 - 04:43:20 EST


The following commit has been merged into the perf/urgent branch of tip:

Commit-ID: 030a976efae83f7b6593afb11a8254d42f9290fe
Gitweb: https://git.kernel.org/tip/030a976efae83f7b6593afb11a8254d42f9290fe
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
AuthorDate: Sat, 19 Nov 2022 10:45:54 +08:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Thu, 24 Nov 2022 10:12:23 +01:00

perf: Consider OS filter fail

Some PMUs (notably the traditional hardware kind) have boundary issues
with the OS filter. Specifically, it is possible for
perf_event_attr::exclude_kernel=1 events to trigger in-kernel due to
SKID or errata.

This can upset the sigtrap logic some and trigger the WARN.

However, if this invalid sample is the first we must not loose the
SIGTRAP, OTOH if it is the second, it must not override the
pending_addr with a (possibly) invalid one.

Fixes: ca6c21327c6a ("perf: Fix missing SIGTRAPs")
Reported-by: Pengfei Xu <pengfei.xu@xxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Marco Elver <elver@xxxxxxxxxx>
Tested-by: Pengfei Xu <pengfei.xu@xxxxxxxxx>
Link: https://lkml.kernel.org/r/Y3hDYiXwRnJr8RYG@xxxxxxxxxxxxxxxx
---
kernel/events/core.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index f2bb27e..9d15d2d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9273,6 +9273,19 @@ int perf_event_account_interrupt(struct perf_event *event)
return __perf_event_account_interrupt(event, 1);
}

+static inline bool sample_is_allowed(struct perf_event *event, struct pt_regs *regs)
+{
+ /*
+ * Due to interrupt latency (AKA "skid"), we may enter the
+ * kernel before taking an overflow, even if the PMU is only
+ * counting user events.
+ */
+ if (event->attr.exclude_kernel && !user_mode(regs))
+ return false;
+
+ return true;
+}
+
/*
* Generic event overflow handling, sampling.
*/
@@ -9306,6 +9319,13 @@ static int __perf_event_overflow(struct perf_event *event,
}

if (event->attr.sigtrap) {
+ /*
+ * The desired behaviour of sigtrap vs invalid samples is a bit
+ * tricky; on the one hand, one should not loose the SIGTRAP if
+ * it is the first event, on the other hand, we should also not
+ * trigger the WARN or override the data address.
+ */
+ bool valid_sample = sample_is_allowed(event, regs);
unsigned int pending_id = 1;

if (regs)
@@ -9313,7 +9333,7 @@ static int __perf_event_overflow(struct perf_event *event,
if (!event->pending_sigtrap) {
event->pending_sigtrap = pending_id;
local_inc(&event->ctx->nr_pending);
- } else if (event->attr.exclude_kernel) {
+ } else if (event->attr.exclude_kernel && valid_sample) {
/*
* Should not be able to return to user space without
* consuming pending_sigtrap; with exceptions:
@@ -9330,7 +9350,7 @@ static int __perf_event_overflow(struct perf_event *event,
}

event->pending_addr = 0;
- if (data->sample_flags & PERF_SAMPLE_ADDR)
+ if (valid_sample && (data->sample_flags & PERF_SAMPLE_ADDR))
event->pending_addr = data->addr;
irq_work_queue(&event->pending_irq);
}