Re: [RFC][PATCH] perf, intel: Don't touch MSR_IA32_DEBUGCTLMSR fromNMI context

From: Stephane Eranian
Date: Thu Sep 13 2012 - 07:49:06 EST


On Thu, Sep 13, 2012 at 12:23 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Wed, 2012-09-12 at 18:22 +0200, Peter Zijlstra wrote:
>> Oleg and Sebastian found that touching MSR_IA32_DEBUGCTLMSR from NMI
>> context is problematic since the only way to change the various
>> unrelated bits in there is:
>>
>> debugctl = get_debugctlmsr()
>> /* frob flags in debugctl */
>> update_debugctlmsr(debugctl);
>>
>> Which is entirely unsafe if we prod at the MSR from NMI context.
>>
>> In particular the path that is responsible is:
>>
>> x86_pmu_handle_irq() (NMI handler)
>> x86_pmu_stop()
>> x86_pmu.disable -> intel_pmu_disable_event()
>> intel_pmu_lbr_disable()
>> __intel_pmu_lbr_disable()
>> wrmsrl(MSR_IA32_DEBUGCTLMSR,... );
>>
>> So remove intel_pmu_lbr_{en,dis}able() from
>> intel_pmu_{en,dis}able_event() and stick them in
>> intel_{get,put}_event_constraints() which are only ever called from
>> regular contexts.
>>
>> We combine intel_pmu_needs_lbr_smpl(), which tells us if the events
>> wants LBR data, with event->hw.branch_reg.alloc, which given
>> intel_shared_regs_constraints() is set if our event got the shared
>> resource, to give us a correctly balanced condition in
>> {get,put}_event_constraints() for intel_pmu_lbr_{en,dis}able().
>>
>> Also change core_pmu to only use x86_get_event_constraints since it
>> doesn't support any of the fancy DS (BTS,PEBS), LBR and OFFCORE features
>> that make up intel_{get,put}_event_constraints.
>
> OK, so with the added stipulation that touching the MSR from the NMI
> isn't a problem as long as we ensure we leave the MSR in the same state
> that we found it in, and the above is the only path found so far that
> could cause this to be violated, the patch is fine?
>
Should be, though it is pretty ugly to stash all of this in the
put/get constraints.

I will run some tests.

I wonder what this does when you come in to get/put with a fake cpuc. You don't
want to perturb the local lbr which may be in use.

> In particular we could note that both LBR and BTS use the DEBUGCTL MSR,
> but BTS doesn't throttle, so the LBR code is the only thing needing
> attention as per the above description.
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/