[PATCH] perf: sched out groups atomically

From: Mark Rutland
Date: Tue Jul 26 2016 - 13:12:34 EST


Groups of events are supposed to be scheduled atomically, such that it
is possible to derive meaningful ratios between their values.

We take great pains to achieve this when scheduling event groups to a
PMU in group_sched_in(), calling {start,commit}_txn() (which fall back
to perf_pmu_{disable,enable}() if necessary) to provide this guarantee.
However we don't mirror this in group_sched_out(), and in some cases
events will not be scheduled out atomically.

For example, if we disable an event group with PERF_EVENT_IOC_DISABLE,
we'll cross-call __perf_event_disable() for the group leader, and will
call group_sched_out() without having first disabled the relevant PMU.
We will disable/enable the PMU around each pmu->del() call, but between
each call the PMU will be enabled and events may count.

Avoid this by explicitly disabling and enabling the PMU around event
removal in group_sched_out(), mirroring what we do in group_sched_in().

Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
kernel/events/core.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 68cac68..c4a0ec3 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1770,6 +1770,8 @@ group_sched_out(struct perf_event *group_event,
struct perf_event *event;
int state = group_event->state;

+ perf_pmu_disable(ctx->pmu);
+
event_sched_out(group_event, cpuctx, ctx);

/*
@@ -1778,6 +1780,8 @@ group_sched_out(struct perf_event *group_event,
list_for_each_entry(event, &group_event->sibling_list, group_entry)
event_sched_out(event, cpuctx, ctx);

+ perf_pmu_enable(ctx->pmu);
+
if (state == PERF_EVENT_STATE_ACTIVE && group_event->attr.exclusive)
cpuctx->exclusive = 0;
}
--
1.9.1