Re: 2.3.99-pre3 hangs on netfinity 5500

From: H. Peter Anvin (hpa@transmeta.com)
Date: Sun Mar 26 2000 - 18:50:45 EST


Followup to: <38DE6B72.22635C41@colorfullife.com>
By author: Manfred Spraul <manfreds@colorfullife.com>
In newsgroup: linux.dev.kernel
> >
> > I'm having strange lookups on an IBM netfinity 5500 using kernel version
> > 2.3.99pre-3 (the problem exists at least since 2.3.50). The machine freezes
> > suddenly. The behaviour seems independant of load or network traffic.
> > Any idea how I could diagnose this problem ?
> >
> I had a similar problem:
> when I boot with one cpu, apic enabled, then I loose interrupts, and the
> kernel locks after a few seconds.
>
> Could you try to modify the smp_affinity_masks?
>
> echo 1 > /proc/irq/*/smp_affinity_mask
>

I will try this as well; I get a ton of errors and lockup within 24
hours running 2.3.99-pre3 on a dual PIII machine:

APIC error interrupt on CPU#0, should never happen.
... APIC ESR0: 00000002
... APIC ESR1: 0000000a
... bit 1: APIC Receive CS Error (hw problem).
... bit 3: APIC Receive Accept Error.
APIC error interrupt on CPU#0, should never happen.
... APIC ESR0: 00000008
... APIC ESR1: 00000008
... bit 3: APIC Receive Accept Error.
APIC error interrupt on CPU#0, should never happen.
... APIC ESR0: 00000008
... APIC ESR1: 0000000a
... bit 1: APIC Receive CS Error (hw problem).
... bit 3: APIC Receive Accept Error.
APIC error interrupt on CPU#1, should never happen.
... APIC ESR0: 00000004
... APIC ESR1: 00000004
... bit 2: APIC Send Accept Error.
APIC error interrupt on CPU#1, should never happen.
... APIC ESR0: 00000004
... APIC ESR1: 00000004
... bit 2: APIC Send Accept Error.

Indicentally, ls /proc/irq lists every entry more than once:

0/ 10/ 12/ 15/ 16/ 18/ 19/ 4/ 6/ 8/ prof_cpu_mask*
1/ 11/ 13/ 15/ 17/ 18/ 2/ 5/ 7/ 9/
1/ 12/ 14/ 16/ 17/ 19/ 3/ 6/ 8/ 9/

/proc/interrupts:

          CPU0 CPU1
  0: 94166 103658 IO-APIC-edge timer
  1: 3849 4047 IO-APIC-edge keyboard
  2: 0 0 XT-PIC cascade
  8: 1 0 IO-APIC-edge rtc
  9: 0 0 XT-PIC acpi
 12: 30090 27648 IO-APIC-edge PS/2 Mouse
 13: 1 0 XT-PIC fpu
 15: 3 3 IO-APIC-edge ide1
 16: 0 0 IO-APIC-level es1371
 17: 35657 35866 IO-APIC-level sym53c8xx
 18: 5 6 IO-APIC-level advansys
 19: 47892 48524 IO-APIC-level eth0
NMI: 197689 197689
LOC: 197717 197717
ERR: 2757

Trying to "echo 1 > X/smp_affinity" where X is 2, 5, 9, 10, 11 or 13 kills the process.

       -hpa

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Mar 31 2000 - 21:00:18 EST