Re: [PATCH 00/12] perf: more fixes

From: Peter Zijlstra
Date: Wed Mar 16 2016 - 19:11:25 EST


On Wed, Mar 16, 2016 at 11:59:33PM +0100, Peter Zijlstra wrote:
> Subject: perf, ibs: Fix race with IBS_STARTING state
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Date: Wed Mar 16 23:55:21 CET 2016
>
> While tracing the IBS bits I saw the NMI hitting between clearing
> IBS_STARTING and the actual MSR writes to disable the counter.
>
> Since IBS_STARTING was cleared, the handler assumed these were spurious
> NMIs and because STOPPING wasn't set yet either, insta-triggered an
> "Unknown NMI".
>
> Cure this by clearing IBS_STARTING after disabling the hardware.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
> arch/x86/events/amd/ibs.c | 32 +++++++++++++++++++++++++++++---
> 1 file changed, 29 insertions(+), 3 deletions(-)
>
> --- a/arch/x86/events/amd/ibs.c
> +++ b/arch/x86/events/amd/ibs.c
> @@ -376,7 +376,13 @@ static void perf_ibs_start(struct perf_e
> hwc->state = 0;
>
> perf_ibs_set_period(perf_ibs, hwc, &period);
> + /*
> + * Set STARTED before enabling the hardware, such that
> + * a subsequent NMI must observe it. Then clear STOPPING
> + * such that we don't consume NMIs by accident.
> + */
> set_bit(IBS_STARTED, pcpu->state);
> + clear_bit(IBS_STOPPING, pcpu->state);
> perf_ibs_enable_event(perf_ibs, hwc, period >> 4);

Also, all those atomic ops are probably entirely overkill and we could
use the non-atomic ops. This is all strictly cpu local. But I didn't
want to change too much at once, esp. while there's still problems.