RE: 2.2.16pre7 SMP Crashes

From: George Sexton (gsexton@mhsoftware.com)
Date: Wed Jun 07 2000 - 12:58:07 EST


I tried the NMI oopser and it didn't seem to work. Perhaps I set it up
wrong. I set it for IRQ 1, and that disabled the keyboard. In any case, I
didn't get an oops message when the machine locked up, and I did pound on
the keyboard after the lockup. Below is the boot output captured from the
serial console, followed by the .config file.

LILO boot:
Loading linux............
Linux version 2.2.16pre7 (root@server.wcon.org) (gcc version egcs-2.91.66
19990314/Linux (egcs-1.1.2 release)) #4 SMP Wed Jun 7 10:04:59 MDT 2000
Intel MultiProcessor Specification v1.1
    Virtual Wire compatibility mode.
OEM ID: OEM00000 Product ID: PROD00000000 APIC at: 0xFEE00000
Processor #0 Pentium(tm) Pro APIC version 17
Processor #1 Pentium(tm) Pro APIC version 17
I/O APIC #2 Version 17 at 0xFEC00000.
Processors: 2
mapped APIC to ffffe000 (fee00000)
mapped IOAPIC to ffffd000 (fec00000)
Detected 400915 kHz processor.
Console: colour VGA+ 80x50
Calibrating delay loop... 799.54 BogoMIPS
Memory: 257680k/262144k available (1132k kernel code, 420k reserved, 2856k
data, 56k init)
Dentry hash table entries: 32768 (order 6, 256k)
Buffer cache hash table entries: 262144 (order 8, 1024k)
Page cache hash table entries: 65536 (order 6, 256k)
Checking 386/387 coupling... OK, FPU using exception 16 error reporting.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.35a (19990819) Richard Gooch (rgooch@atnf.csiro.au)
per-CPU timeslice cutoff: 100.09 usecs.
CPU0: Intel Pentium II (Deschutes) stepping 02
calibrating APIC timer ...
..... CPU clock speed is 400.9065 MHz.
..... system bus clock speed is 100.2265 MHz.
Booting processor 1 eip 2000
Calibrating delay loop... 801.18 BogoMIPS
OK.
CPU1: Intel Pentium II (Deschutes) stepping 02
Total of 2 processors activated (1600.72 BogoMIPS).
enabling symmetric IO mode... ...done.
ENABLING IO-APIC IRQs
init IO_APIC IRQs
 IO-APIC (apicid-pin) 2-0, 2-17, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
NMI Watchdog activated on source IRQ 1
number of MP IRQ sources: 20.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................

IO APIC #2......
.... register #00: 02000000
....... : physical APIC id: 02
.... register #01: 00170011
....... : max redirection entries: 0017
....... : IO APIC version: 0011
.... register #02: 00000000
....... : arbitration: 00
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
 00 000 00 1 0 0 0 0 0 0 00
 01 000 00 0 0 0 0 0 1 4 59
 02 0FF 0F 0 0 0 0 0 1 1 51
 03 000 00 0 0 0 0 0 1 1 61
 04 000 00 0 0 0 0 0 1 1 69
 05 000 00 0 0 0 0 0 1 1 71
 06 000 00 0 0 0 0 0 1 1 79
 07 000 00 0 0 0 0 0 1 1 81
 08 000 00 0 0 0 0 0 1 1 89
 09 000 00 0 0 0 0 0 1 1 91
 0a 000 00 0 0 0 0 0 1 1 99
 0b 000 00 0 0 0 0 0 1 1 A1
 0c 000 00 0 0 0 0 0 1 1 A9
 0d 000 00 1 0 0 0 0 0 0 00
 0e 000 00 0 0 0 0 0 1 1 B1
 0f 000 00 0 0 0 0 0 1 1 B9
 10 0FF 0F 1 1 0 1 0 1 1 C1
 11 000 00 1 0 0 0 0 0 0 00
 12 0FF 0F 1 1 0 1 0 1 1 C9
 13 000 00 1 0 0 0 0 0 0 00
 14 000 00 1 0 0 0 0 0 0 00
 15 000 00 1 0 0 0 0 0 0 00
 16 000 00 1 0 0 0 0 0 0 00
 17 000 00 1 0 0 0 0 0 0 00
.................................... done.
checking TSC synchronization across CPUs: passed.
mtrr: your CPUs had inconsistent variable MTRR settings
mtrr: probably your BIOS does not setup all CPUs
PCI: PCI BIOS revision 2.10 entry at 0xfb3f0
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI->APIC IRQ transform: (B0,I11,P0) -> 16
PCI->APIC IRQ transform: (B0,I17,P0) -> 18
PCI->APIC IRQ transform: (B1,I0,P0) -> 16
Linux NET4.0 for Linux 2.2
Based upon Swansea University Computer Society NET3.039
WAN Router v1.1 (c) 1995-1999 Sangoma Technologies Inc.
NET4: Unix domain sockets 1.0 for Linux NET4.0.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
TCP: Hash tables configured (ehash 262144 bhash 65536)
Initializing RT netlink socket
Starting kswapd v 1.5
parport0: PC-style at 0x378 [SPP,PS2,EPP]
Detected PS/2 Mouse Port.
Serial driver version 4.27 with<4>Keyboard timeout[2]
 no serial options enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
pty: 256 Unix98 ptys configured
lp0: using parport0 (polling).
Real Time Clock Driver v1.09
PIIX4: IDE controller on PCI bus 00 dev 39
PIIX4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:pio, hdb:pio
hda: Maxtor 91728D8, ATA DISK drive
hdb: ATAPI CD-ROM DRIVE 40X MAXIMUM, ATAPI CDROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: Maxtor 91728D8, 16479MB w/512kB Cache, CHS=2100/255/63, UDMA
hdb: ATAPI 40X CD-ROM drive, 128kB Cache
Uniform CD-ROM driver Revision: 3.08
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
(scsi0) <Adaptec AIC-7880 Ultra SCSI host adapter> found at PCI 0/11/0
(scsi0) Wide Channel, SCSI ID=7, 16/255 SCBs
(scsi0) Downloading sequencer code... 423 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4
       <Adaptec AIC-7880 Ultra SCSI host adapter>
scsi : 1 host.
(scsi0:0:0:0) Synchronous at 40.0 Mbyte/sec, offset 8.
  Vendor: IBM Model: DDRS-39130W Rev: S97B
  Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
(scsi0:0:3:0) Synchronous at 5.0 Mbyte/sec, offset 15.
  Vendor: DEC Model: DLT2000 Rev: 8B37
  Type: Sequential-Access ANSI SCSI revision: 02
  Vendor: IOMEGA Model: ZIP 100 Rev: J.03
  Type: Direct-Access ANSI SCSI revision: 02
Detected scsi removable disk sdb at scsi0, channel 0, id 5, lun 0
scsi : detected 2 SCSI disks total.
SCSI device sda: hdwr sector= 512 bytes. Sectors= 17850000 [8715 MB] [8.7
GB]
sdb : READ CAPACITY failed.
sdb : status = 1, message = 00, host = 0, driver = 28
sdb : extended sense code = 2
sdb : block size assumed to be 512 bytes, disk size 1GB.
PPP: version 2.3.7 (demand dialling)
TCP compression code copyright 1989 Regents of the University of California
PPP line discipline registered.
eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
eepro100.c: $Revision: 1.20.2.10 $ 2000/05/31 Modified by Andrey V.
Savochkin <saw@saw.sw.com.sg> and others
eth0: Intel PCI EtherExpress Pro100 82557, 00:90:27:84:9C:17, IRQ 18.
  Receiver lock-up bug exists -- enabling work-around.
  Board assembly 721383-006, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x04f4518b).
eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
eepro100.c: $Revision: 1.20

CONFIG_EXPERIMENTAL=y

CONFIG_M686=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_1GB=y
CONFIG_MTRR=y
CONFIG_SMP=y

CONFIG_MODULES=y

CONFIG_NET=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_QUIRKS=y
CONFIG_PCI_OLD_PROC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_SYSVIPC=y
CONFIG_SYSCTL=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y
CONFIG_PARPORT=y
CONFIG_PARPORT_PC=y

CONFIG_PNP=y

CONFIG_BLK_DEV_FD=y
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_IDEDMA_AUTO=y
CONFIG_PARIDE_PARPORT=y

CONFIG_PACKET=y
CONFIG_NETLINK=y
CONFIG_RTNETLINK=y
CONFIG_NETLINK_DEV=y
CONFIG_FIREWALL=y
CONFIG_FILTER=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_RTNETLINK=y
CONFIG_NETLINK=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_FIREWALL=y
CONFIG_IP_FIREWALL_NETLINK=y
CONFIG_NETLINK_DEV=y
CONFIG_IP_TRANSPARENT_PROXY=y
CONFIG_IP_MASQUERADE=y
CONFIG_IP_MASQUERADE_ICMP=y
CONFIG_IP_MASQUERADE_MOD=y
CONFIG_IP_MASQUERADE_IPAUTOFW=y
CONFIG_SYN_COOKIES=y
CONFIG_SKB_LARGE=y
CONFIG_WAN_ROUTER=y

CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=m
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y

CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_TCQ_ON_BY_DEFAULT=y
CONFIG_AIC7XXX_CMDS_PER_DEVICE=8
CONFIG_AIC7XXX_RESET_DELAY=5

CONFIG_NETDEVICES=y

CONFIG_DUMMY=m

CONFIG_NET_ETHERNET=y
CONFIG_NET_EISA=y
CONFIG_EEXPRESS_PRO100=y

CONFIG_PPP=y

CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_SERIAL=y
CONFIG_SERIAL_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=64
CONFIG_PRINTER=y
CONFIG_PRINTER_READBACK=y
CONFIG_MOUSE=y

CONFIG_PSMOUSE=y
CONFIG_82C710_MOUSE=y

CONFIG_WATCHDOG=y

CONFIG_NMI_WATCHDOG=y
CONFIG_NMI_WATCHDOG_IRQ=1
CONFIG_RTC=y

CONFIG_AUTOFS_FS=y
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_PROC_FS=y
CONFIG_DEVPTS_FS=y
CONFIG_EXT2_FS=y

CONFIG_SMB_FS=y

CONFIG_NLS=y

CONFIG_NLS_DEFAULT="cp437"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_850=m

CONFIG_VGA_CONSOLE=y
CONFIG_VIDEO_SELECT=y

CONFIG_MAGIC_SYSRQ=y
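
A minimal sketch, in Python, of one way to read the IRQ redirection table
from the boot log above and list the pins routed as NMI. It assumes the
columns follow the order of the log's header line (NR ... Deli Vect) and
that the Deli value uses the delivery-mode encoding from the Intel 82093AA
I/O APIC datasheet, where 4 means NMI. The function names and the sample
constant are hypothetical; the rows are copied from the log. In this table
only pin 01 shows Deli 4, which matches "NMI Watchdog activated on source
IRQ 1". Whether that routing is why the keyboard stopped responding is an
inference, not something the log confirms.

```python
# Delivery-mode encodings as given in the Intel 82093AA I/O APIC datasheet.
DELIVERY_MODES = {0: "Fixed", 1: "LowestPrio", 2: "SMI",
                  4: "NMI", 5: "INIT", 7: "ExtINT"}

# Header and a few rows copied from the boot log (hypothetical sample name).
SAMPLE_ROWS = """\
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
 00 000 00 1 0 0 0 0 0 0 00
 01 000 00 0 0 0 0 0 1 4 59
 02 0FF 0F 0 0 0 0 0 1 1 51
 0e 000 00 0 0 0 0 0 1 1 B1
"""


def parse_redirection_rows(text):
    """Yield (pin, masked, delivery_mode, vector) for each table row."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 11:
            continue  # not a redirection-table row
        try:
            pin = int(fields[0], 16)
            delivery = int(fields[9], 16)
            vector = int(fields[10], 16)
        except ValueError:
            continue  # the header line has 11 fields but is not numeric
        masked = fields[3] == "1"
        yield pin, masked, delivery, vector


def nmi_pins(text):
    """Return the unmasked pins whose delivery mode is NMI (4)."""
    return [pin for pin, masked, delivery, _ in parse_redirection_rows(text)
            if not masked and delivery == 4]


if __name__ == "__main__":
    for pin in nmi_pins(SAMPLE_ROWS):
        print(f"pin {pin:#04x} is unmasked and delivered as "
              f"{DELIVERY_MODES[4]}")
```

Feeding the full table from the serial console capture to nmi_pins() works
the same way; lines that are not table rows are skipped.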

-----Original Message-----
From: owner-linux-kernel@vger.rutgers.edu
[mailto:owner-linux-kernel@vger.rutgers.edu]On Behalf Of Marcelo Tosatti
Sent: Friday, June 02, 2000 11:37 AM
To: George Sexton
Cc: linux-kernel@vger.rutgers.edu
Subject: RE: 2.2.16pre7 SMP Crashes

You can try the NMI oopser to possibly get a clue about the crash.
http://people.redhat.com/mingo/NMI-watchdog-patches/NMI-oopser-2.2.15-A0.
Documentation on how to use it is included in the patch.

On Fri, 2 Jun 2000, George Sexton wrote:

> One other data point. The heavily used machine is running Sybase Adaptive
> Server Anywhere, and we are hitting MAJOR performance problems with
> 2.2.16pre7. Queries that used to finish instantly are now taking over 5
> minutes to complete with the dbsrv process sitting pegged at 100% CPU
> Usage for one processor.
>
> I am going to set it back to 2.2.14 right now until I can get this
> resolved.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/




This archive was generated by hypermail 2b29 : Wed Jun 07 2000 - 21:00:29 EST