Re: [patch 05/24] perfmon: X86 generic code (x86)

From: stephane eranian
Date: Thu Nov 27 2008 - 07:04:53 EST


Peter,

On Thu, Nov 27, 2008 at 12:52 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Thu, 2008-11-27 at 12:35 +0100, stephane eranian wrote:
>> On Thu, Nov 27, 2008 at 12:31 PM, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
>> >> The only reason why I have to deal with NMI is not so much to allow
>> >> for profiling irq-off regions but because I have to share the PMU with
>> >> the NMI watchdog. Otherwise I'd have to fail or disable the NMI watchdog
>> >> on the fly.
>> >
>> > The NMI watchdog is now off by default so failing with it enabled
>> > is fine.
>>
>> Yes, but most likely it is on in distro kernels.
>
> So? You can disable it on the fly when there is a perfmon user.
>
Yes, you can. There is clearly an interface to do this. I think this is the
best solution. I know it can work because I experimented with this approach
as recently as last month. But I ran into a bug, which I reported on LKML. I
did not provide a patch because I did not fully understand the connection to
suspend/resume.

The bug has to do with some obscure suspend/resume sequence in:

void setup_apic_nmi_watchdog(void *unused)
{
	if (__get_cpu_var(wd_enabled))
		return;

	/* cheap hack to support suspend/resume */
	/* if cpu0 is not active neither should the other cpus */
	if (smp_processor_id() != 0 && atomic_read(&nmi_active) <= 0)
		return;

Basically, when you re-enable the NMI watchdog, it is not always re-enabled
correctly on all CPUs; it depends on the order in which they process the IPI.
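
To make the ordering dependency concrete, here is a minimal user-space sketch
of the check quoted above. It is not kernel code. The names wd_state,
sim_setup_watchdog(), sim_reenable() and the NR_CPUS value are hypothetical,
and the final "enable" step (set wd_enabled, increment nmi_active) is an
assumption about the part of the function that is not quoted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical CPU count for this simulation */

/* User-space stand-ins for the kernel state named in the quoted code. */
struct wd_state {
	int nmi_active;			/* models atomic_t nmi_active */
	bool wd_enabled[NR_CPUS];	/* models the per-CPU wd_enabled flag */
};

/*
 * Models setup_apic_nmi_watchdog() running on one CPU. The two early
 * returns mirror the quoted code; the enable step at the end is an
 * ASSUMPTION about the unquoted remainder of the function.
 */
static void sim_setup_watchdog(struct wd_state *s, int cpu)
{
	if (s->wd_enabled[cpu])
		return;

	/* non-boot CPUs bail out while no CPU has an active watchdog */
	if (cpu != 0 && s->nmi_active <= 0)
		return;

	s->wd_enabled[cpu] = true;	/* assumed */
	s->nmi_active++;		/* assumed */
}

/*
 * Starts from a fully disabled watchdog, runs the setup on each CPU in
 * the given order (the order in which the CPUs handle the IPI), and
 * returns how many CPUs ended up with the watchdog enabled.
 */
static int sim_reenable(const int *order, size_t n)
{
	struct wd_state s = { 0 };
	int enabled = 0;

	assert(n <= NR_CPUS);
	for (size_t i = 0; i < n; i++) {
		assert(order[i] >= 0 && order[i] < NR_CPUS);
		sim_setup_watchdog(&s, order[i]);
	}
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		enabled += s.wd_enabled[cpu];
	return enabled;
}
```

Under these assumptions, calling sim_reenable() with the order 0,1,2,3
enables all four CPUs, while the order 1,0,2,3 leaves CPU 1 without a
watchdog because it ran the check before CPU 0 had raised nmi_active.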