Re: Dual XEON - >>SLOW<< on SMP

From: Keith Owens (kaos@ocs.com.au)
Date: Tue Nov 07 2000 - 17:41:54 EST


On Tue, 7 Nov 2000 17:31:19 -0500 (EST),
"Richard B. Johnson" <root@chaos.analogic.com> wrote:
>Also, I get some CPU watchdog timeout that I didn't ask for Grrr...
>
>Nov 7 17:17:54 chaos nmbd[115]: Samba server CHAOS is now a domain master browser for workgroup LINUX on subnet 204.178.40.224
>Nov 7 17:17:54 chaos nmbd[115]:
>Nov 7 17:17:54 chaos nmbd[115]: *****
>Nov 7 17:18:54 chaos kernel: NMI Watchdog detected LOCKUP on CPU0, registers:
>Nov 7 17:18:54 chaos kernel: CPU: 0
>Nov 7 17:19:01 chaos login: ROOT LOGIN ON tty2

Which means that one of the cpus is spinning for 5 seconds with
interrupts disabled. CPU watchdogs are *good*.

>
> CPU0 CPU1
> 0: 10945 11869 IO-APIC-edge timer
> 1: 419 393 IO-APIC-edge keyboard
> 2: 0 0 XT-PIC cascade
> 8: 0 0 IO-APIC-edge rtc
> 10: 2990 2904 IO-APIC-level eth0
> 11: 1066 1124 IO-APIC-level BusLogic BT-958
> 13: 0 0 XT-PIC fpu
>NMI: 22748 22748
>LOC: 21731 22229
>ERR: 0
>
>
>The NMI and LOC (timers) run faster than timer channel 0. This
>cannot be correct. Anybody know what this is and how to get
>rid of these CPU time stealers?

The timer is directed both as a normal interrupt 0 and as a broadcast
non maskable interrupt. The NMI count on each cpu should be roughly
the sum of the interrupt 0 count across all cpus.

The NMI path is fairly fast so the overhead is small. When it does
trip you have a problem, a cpu is spinning for far too long. Extract
the NMI report from the log, run it through ksymoops and mail the
decoded result.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Nov 07 2000 - 21:00:23 EST