Re: [patch 06/11] x86: nmi_32/64.c - use apic_write_around instead of apic_write

From: Maciej W. Rozycki
Date: Wed May 28 2008 - 13:48:40 EST


On Wed, 28 May 2008, Cyrill Gorcunov wrote:

> Thanks a lot, Maciej!!! Could you please explain me how did you find
> that? 'cause reporter said that with nmi_watchdog=2 it works and with
> nmi_watchdog=1 it stalls? Maybe I should better make this function
> the same as 64bit version has? I.e. set nmi_watchdog = NMI_NONE by default?

Well, nmi_watchdog=1 is the I/O APIC watchdog and if no watchdog has been
specified at the command line, the piece of code you have moved selects
between the local and the I/O APIC watchdog based on availability of the
former. So in this case the local watchdog must have been unavailable as
it works if requested explicitly.
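
For illustration, the selection described above can be sketched as a small
stand-alone model. This is not the code from nmi_32.c: the enum, the
function name wd_resolve() and the local_available flag are all made up
for the example, and only the values 1 and 2 follow the nmi_watchdog=
command line options discussed here.

```c
#include <stdbool.h>

/*
 * Illustrative model only; every name here is hypothetical and nothing
 * is copied from the kernel sources.
 */
enum wd_mode {
	WD_DEFAULT = -1,	/* nothing requested on the command line */
	WD_NONE    = 0,		/* watchdog disabled */
	WD_IO_APIC = 1,		/* as requested with nmi_watchdog=1 */
	WD_LOCAL   = 2,		/* as requested with nmi_watchdog=2 */
};

/*
 * Resolve the default to a concrete mode.  An explicit request is kept
 * as is; otherwise the local watchdog is preferred when available and
 * the I/O APIC one is the fallback.
 */
static enum wd_mode wd_resolve(enum wd_mode requested, bool local_available)
{
	if (requested != WD_DEFAULT)
		return requested;

	return local_available ? WD_LOCAL : WD_IO_APIC;
}
```

In this model a default boot ends up on the I/O APIC watchdog only when
the local one is reported unavailable, which is the case assumed above.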

No piece of code in nmi_watchdog_default() touches peripheral hardware,
and native_smp_prepare_cpus() is called early enough that the system is
still running UP and no APIC setup has happened yet, so any interference
with running hardware can be excluded.

Random lock-ups are a typical symptom of the NMI watchdog interfering
with SMM firmware -- of course in the context of the watchdog being
suspected in the first place -- there may be plenty of other reasons for
random lock-ups. Obviously this is the SMM firmware asking for trouble
explicitly, because NMIs are disabled by the processor upon entering
SMM and it is the SMI handler that unmasks the NMI explicitly (with an
IRET, which shouldn't be used in SMM at all) -- otherwise it wouldn't
even notice the watchdog running, but there you go.

As a rule of thumb, any piece of firmware that may run from an OS context
should not use interrupts of any kind, because quite likely it cannot
handle them in the way the OS expects them to be handled. It is as
simple as that, but perhaps too simple for some to comprehend. :(

Maciej