[tip: perf/urgent] perf/x86/intel: Fix a warning on x86_pmu_stop() with large PEBS

From: tip-bot2 for Namhyung Kim
Date: Thu Dec 03 2020 - 04:08:37 EST


The following commit has been merged into the perf/urgent branch of tip:

Commit-ID: 5debf02131227d39988e44adf5090fb796fa8466
Gitweb: https://git.kernel.org/tip/5debf02131227d39988e44adf5090fb796fa8466
Author: Namhyung Kim <namhyung@xxxxxxxxxx>
AuthorDate: Thu, 26 Nov 2020 20:09:21 +09:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Thu, 03 Dec 2020 10:00:26 +01:00

perf/x86/intel: Fix a warning on x86_pmu_stop() with large PEBS

The commit 3966c3feca3f ("x86/perf/amd: Remove need to check "running"
bit in NMI handler") introduced this. It seems x86_pmu_stop can be
called recursively (like when it losts some samples) like below:

x86_pmu_stop
intel_pmu_disable_event (x86_pmu_disable)
intel_pmu_pebs_disable
intel_pmu_drain_pebs_nhm (x86_pmu_drain_pebs_buffer)
x86_pmu_stop

While commit 35d1ce6bec13 ("perf/x86/intel/ds: Fix x86_pmu_stop
warning for large PEBS") fixed it for the normal cases, there's
another path to call x86_pmu_stop() recursively when a PEBS error was
detected (like two or more counters overflowed at the same time).

Like in the Kan's previous fix, we can skip the interrupt accounting
for large PEBS, so check the iregs which is set for PMI only.

Fixes: 3966c3feca3f ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
Reported-by: John Sperbeck <jsperbeck@xxxxxxxxxx>
Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: https://lkml.kernel.org/r/20201126110922.317681-1-namhyung@xxxxxxxxxx
---
arch/x86/events/intel/ds.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index b47cc42..89dba58 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1940,7 +1940,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
if (error[bit]) {
perf_log_lost_samples(event, error[bit]);

- if (perf_event_account_interrupt(event))
+ if (iregs && perf_event_account_interrupt(event))
x86_pmu_stop(event, 0);
}