Re: Linux 2.2.2pre4 SMP crash (wait_on_bh)

Jordan Mendelson (jordy@wserv.com)
Sat, 20 Feb 1999 22:14:01 -0500


Simon Kirby wrote:
>
> On Sat, 20 Feb 1999, Jordan Mendelson wrote:
>
> > Andrea Arcangeli wrote:
> > >
> > > On Sat, 20 Feb 1999, Terry Katz wrote:
> > >
> > > >I've experienced the same problem... 2.2.1, and 2.2.2pre2... There have been
> > > >a number of people who have brought this up... No one has really been
> > > >helpful in an answer as to how to fix. Linus suggested 2.2.2pre2 to me a
> > > >short time ago, but just recently it happened with that so ...
> > >
> > > Are you using the shaper. The shaper had a SMP race that could cause
> > > exactly this kind of SMP hang. I just fixed it and you can find my fix in
> > > both linux-kernel and linux-net.
> >
> > No traffic shaper here.. my current running config is:
>
> I noticed you are using a buslogic SCSI controller -- the machine that is
> having problems here is using an AIC7xxx, so perhaps the low level driver
> isn't the problem. What Ethernet card are you using? I noticed
> "EEPRO100" in your conf, which is what we are using here.

Sounds like the eepro100 is the common factor in 3 cases of this bug
I've seen so far. I'll just go through the entire config.

Here is /proc/pci:

PCI devices found:
Bus 0, device 0, function 0:
Host bridge: Intel 440LX - 82443LX PAC Host (rev 3).
Medium devsel. Fast back-to-back capable. Master Capable.
Latency=64.
Prefetchable 32 bit memory at 0xe4000000 [0xe4000008].
Bus 0, device 1, function 0:
PCI bridge: Intel 440LX - 82443LX PAC AGP (rev 3).
Medium devsel. Fast back-to-back capable. Master Capable.
Latency=64. Min Gnt=7.
Bus 0, device 7, function 0:
ISA bridge: Intel 82371AB PIIX4 ISA (rev 1).
Medium devsel. Fast back-to-back capable. Master Capable. No
bursts.
Bus 0, device 7, function 1:
IDE interface: Intel 82371AB PIIX4 IDE (rev 1).
Medium devsel. Fast back-to-back capable. Master Capable.
Latency=64.
I/O at 0xffa0 [0xffa1].
Bus 0, device 7, function 2:
USB Controller: Intel 82371AB PIIX4 USB (rev 1).
Medium devsel. Fast back-to-back capable. Master Capable. No
bursts.
I/O at 0x6000 [0x6001].
Bus 0, device 7, function 3:
Bridge: Intel 82371AB PIIX4 ACPI (rev 1).
Medium devsel. Fast back-to-back capable.
Bus 0, device 8, function 0:
SCSI storage controller: BusLogic MultiMaster (rev 8).
Fast devsel. IRQ 16. Master Capable. Latency=64. Min Gnt=8.Max
Lat=8.
I/O at 0xdff0 [0xdff1].
Non-prefetchable 32 bit memory at 0xeffff000 [0xeffff000].
Bus 0, device 9, function 0:
Ethernet controller: Intel 82557 (rev 5).
Medium devsel. Fast back-to-back capable. IRQ 17. Master
Capable. Latency=64. Min Gnt=8.Max Lat=56.
Prefetchable 32 bit memory at 0xeb9ff000 [0xeb9ff008].
I/O at 0xdf00 [0xdf01].
Non-prefetchable 32 bit memory at 0xefe00000 [0xefe00000].
Bus 0, device 10, function 0:
VGA compatible controller: S3 Inc. Vision 868 (rev 0).
Medium devsel.
Non-prefetchable 32 bit memory at 0x80000000 [0x80000000].

/proc/interrupts:
CPU0 CPU1
0: 2122506 2121315 IO-APIC-edge timer
1: 888 878 IO-APIC-edge keyboard
2: 0 0 XT-PIC cascade
8: 0 1 IO-APIC-edge rtc
13: 1 0 XT-PIC fpu
16: 135067 135339 IO-APIC-level BusLogic BT-948
17: 1541493 1540939 IO-APIC-level Intel EtherExpress Pro
10/100 Ethernet
NMI: 0
ERR: 0

/proc/ioports:

0000-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0070-007f : rtc
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
02f8-02ff : serial(auto)
03c0-03df : vga+
03f8-03ff : serial(auto)
df00-df1f : Intel Speedo3 Ethernet
dff0-dff3 : BusLogic BT-948

/proc/devices:

Character devices:
1 mem
2 pty
3 ttyp
4 ttyS
5 cua
7 vcs
9 st
10 misc
36 netlink
128 ptm
136 pts

Block devices:
2 fd
8 sd

/proc/cpuinfo:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 3
model name : Pentium II (Klamath)
stepping : 4
cpu MHz : 267.278723
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov mmx
bogomips : 266.24

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 3
model name : Pentium II (Klamath)
stepping : 4
cpu MHz : 267.278723
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov mmx
bogomips : 267.06

(odd that the bogomips are different on each processor even though the
mhz are the same)

/proc/mtrr:

reg00: base=0x00000000 ( 0MB), size= 256MB: write-back, count=1

> Are you running a lot of net traffic through the machine? This machine
> that has had the problem 4 times in the last month or so sends ands
> receives about 500,000 packets per hour. It sends and receives mail,
> handles POP3 clients, runs a big nameserver, and has a few people logged
> in occasionally to read mail.

Yes, this machine acts as our transparent squid server and also hosts a
few high-traffic web sites. However the machine died at 5:30 AM, which
is an odd time for most people to be accessing this machine.

This box was stable, it just recently started to crash with 2.2.x's with
this particular error, so it seems to be a kernel problem as far as I
can tell.

Jordan

--
Jordan Mendelson     : http://jordy.wserv.com
Web Services, Inc.   : http://www.wserv.com

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/