[patch, 2.6.10-rc3] safe_hlt() & NMIs

From: Ingo Molnar
Date: Tue Dec 14 2004 - 18:52:31 EST



* Lee Revell <rlrevell@xxxxxxxxxxx> wrote:

> On Sun, 2004-12-12 at 13:15 +0100, Andrea Arcangeli wrote:
> > Overall this is a very minor issue (unless HZ is 0), it would only
> > introduce a 1/HZ latency to the irqs that get posted while the nmi
> > handler is running, and the nmi handler never runs in production.
>
> Ingo, couldn't this account for some of the inexplicable outliers some
> people were seeing in latency tests?

indeed, there could be a connection, and it's certainly a fun race. The
proper fix is Manfred's suggestion: in the NMI handler, check whether the
saved EIP is a kernel text address, and if yes, whether it points at a
HLT instruction - and if yes then increase EIP by 1, so that the
interrupted code resumes after the HLT instead of halting. I've included
the fix in the -33-02 -RT patch. Andrew, Linus: upstream fix is below - i
think it's post-2.6.10 stuff. Tested it on SMP and UP x86, using both the
IO-APIC and the local-APIC based NMI watchdog.
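
to illustrate the check outside the kernel, here is a minimal userspace
sketch. It is not the kernel code: struct fake_regs, fake_text[],
in_fake_text() and fixup_hlt() are hypothetical stand-ins for pt_regs, the
kernel text segment, __kernel_text_address() and the fixup in do_nmi(). It
only assumes that HLT is the single-byte opcode 0xf4.

```c
#include <stddef.h>
#include <stdint.h>

#define HLT_OPCODE 0xf4u	/* one-byte x86 encoding of HLT */

/* Hypothetical stand-in for the saved register frame (pt_regs). */
struct fake_regs {
	size_t ip;	/* index into fake_text[], standing in for EIP */
};

/* Simulated "kernel text": nop, sti, hlt, nop */
static const uint8_t fake_text[] = { 0x90, 0xfb, 0xf4, 0x90 };

/* Hypothetical analogue of __kernel_text_address(). */
static int in_fake_text(size_t ip)
{
	return ip < sizeof(fake_text);
}

/*
 * If the saved ip lies in the simulated text and points at a HLT,
 * step it past the instruction. Returns 1 if ip was adjusted, else 0.
 */
static int fixup_hlt(struct fake_regs *regs)
{
	if (in_fake_text(regs->ip) && fake_text[regs->ip] == HLT_OPCODE) {
		regs->ip++;
		return 1;
	}
	return 0;
}
```

calling fixup_hlt() with ip == 2 (the hlt byte) moves ip to 3; any other ip
is left alone. The sketch reads the opcode as an unsigned byte on purpose:
where plain char is signed, the byte 0xf4 read through a char pointer is
negative and never compares equal to 0xf4.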

i think x86-64 needs a similar fix as well.

Ingo

--- linux/arch/i386/kernel/traps.c.orig
+++ linux/arch/i386/kernel/traps.c
@@ -670,6 +670,17 @@ fastcall void do_nmi(struct pt_regs * re

 	cpu = smp_processor_id();

+	/*
+	 * Fix up obscure CPU behavior: if we interrupt safe_hlt() via
+	 * the NMI then we might miss a reschedule if an interrupt is
+	 * posted to the CPU and executes before the HLT instruction.
+	 *
+	 * We check whether the EIP is kernelspace, and if yes, whether
+	 * the instruction is HLT:
+	 */
+	if (__kernel_text_address(regs->eip) && *(unsigned char *)regs->eip == 0xf4)
+		regs->eip++;
+
 #ifdef CONFIG_HOTPLUG_CPU
 	if (!cpu_online(cpu)) {
 		nmi_exit();