Re: [PATCH] x86: fix duplicate calls of the nmi handler

From: Robert Richter
Date: Tue Sep 14 2010 - 13:45:23 EST


On 13.09.10 16:37:13, Robert Richter wrote:
> Ingo, Peter,
>
> I finally found a system here, will start debugging

I found the reason why we get the unknown NMI. For some reason
cpuc->active_mask in x86_pmu_handle_irq() is zero. Thus, no counters
are handled when we get an NMI. There seems to be a race somewhere in
accessing active_mask. So far I don't have a fix available.
Changing x86_pmu_stop() as follows did not help:

static void x86_pmu_stop(struct perf_event *event, int flags)
{
	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
	struct hw_perf_event *hwc = &event->hw;

	if (test_bit(hwc->idx, cpuc->active_mask)) {
		x86_pmu.disable(event);
		__clear_bit(hwc->idx, cpuc->active_mask);
		cpuc->events[hwc->idx] = NULL;
		WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
		hwc->state |= PERF_HES_STOPPED;
	}
	...
}
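
To illustrate why a zero active_mask ends in an unknown NMI, here is a
minimal user-space sketch. It is not the kernel code. The names
sim_cpu_events, sim_start(), sim_stop(), sim_handle_irq() and
NUM_COUNTERS are hypothetical, and the overflowed[] array is an assumed
stand-in for the hardware counter state. The only behavior modeled is
that the handler skips every counter whose bit is clear in the mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_COUNTERS 4	/* hypothetical; the real count is CPU-specific */

/* Simplified stand-in for struct cpu_hw_events (assumption, not kernel code). */
struct sim_cpu_events {
	uint64_t active_mask;		/* bit i set: counter i is in use */
	bool overflowed[NUM_COUNTERS];	/* stand-in for hardware overflow state */
};

/* Mark counter idx as active, as starting an event would. */
static void sim_start(struct sim_cpu_events *c, unsigned int idx)
{
	assert(idx < NUM_COUNTERS);
	c->active_mask |= UINT64_C(1) << idx;
}

/* Clear the active bit for counter idx, as stopping an event would. */
static void sim_stop(struct sim_cpu_events *c, unsigned int idx)
{
	assert(idx < NUM_COUNTERS);
	c->active_mask &= ~(UINT64_C(1) << idx);
}

/*
 * Simplified NMI handler: services overflowed counters, but only those
 * whose bit is set in active_mask. Returns the number of counters
 * handled; 0 means nobody claimed the NMI.
 */
static int sim_handle_irq(struct sim_cpu_events *c)
{
	int handled = 0;

	for (unsigned int idx = 0; idx < NUM_COUNTERS; idx++) {
		if (!(c->active_mask & (UINT64_C(1) << idx)))
			continue;	/* inactive counters are skipped */
		if (!c->overflowed[idx])
			continue;	/* active, but did not overflow */
		c->overflowed[idx] = false;
		handled++;
	}
	return handled;
}
```

In this model, if counter 1 overflows while its bit is set,
sim_handle_irq() returns 1. If the bit is cleared by sim_stop() while
the overflow is still pending, the handler returns 0 and the overflow
stays unserviced, which corresponds to the unknown NMI case described
above.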

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center
