Oops in 2.3.99-pre4 and -pre8 during bootup

From: Marc Buchmann (drsludge@home.com)
Date: Thu May 18 2000 - 19:53:50 EST


I've gotten an Oops followed by a hard lockup during bootup on both
2.3.99-pre4 and -pre8, at the same spot (immediately after freeing
unused kernel memory, but before init starts). The oops does *not* occur
in -pre1, -pre3, or 2.3.51 (I test-booted all 5 kernels to nail down which
versions were causing it). The system locks hard afterwards: no keyboard,
no magic SysRq.

My system is a dual-P2 233 (not OC'd) on a QDI LegenX 4 (440LX with onboard
AIC7880 and Intel EtherExpress 10/100), running a mostly RedHat 5.2 install
(with various bits from 6.0 and 6.2 thrown in). Compiler is egcs-1.1.12.

Interestingly, I just installed this mobo on May 12th (Friday). The previous
board (a Gigabyte 686-dlx) booted up -pre4 just fine, but because of other
issues (it's a POS board throwing spurious APIC interrupts and the like), I
didn't leave it up and running for too long.

I hope this is enough info to track the problem down. If not, the crash
is 100% reproducible (all I have to do is boot up -pre4).

thanks.

Here's the ksymoops output (.config appended afterwards):

ksymoops 2.3.4 on i686 2.2.15. Options used
     -v /usr/src/linux-2.3.99-pre4/vmlinux (specified)
     -K (specified)
     -L (specified)
     -o /lib/modules/2.3.99-pre4 (specified)
     -m /usr/src/linux-2.3.99-pre4/System.map (specified)

No modules in ksyms, skipping objects
Unable to handle kernel paging request at virtual address 656c6f95
c014e405
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c014e405>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 000010212
eax: 00000000 ebx: ffffffd8 ecx: c13f9000 edx: c13f9000
esi: 656c6f73 edi: c13f9000 ebp: 00000000 esp: c1271f28
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 1, stackpage=c1271000)
Stack: 00000000 c13f9000 00000002 c1270000 00000000 c02a4d34 c13f9000 00000006
       c013e0b7 c13f9000 00000003 00000000 c1271f68 c1270000 c1270000 00000000
       c13f9000 c0241365 00000000 00000000 00000000 c013e413 c13f9000 00000000
Call Trace: [<c013e0b7>] [<c0241365>] [<c013e413>] [<c010bf80>] [<c0241365>] [<c0110018>] [<c0240018>]
            [<c010719c>] [<c010907b>]
Code: 0f b7 46 22 25 00 f0 ff ff 66 3d 00 a0 0f 84 98 01 00 00 bb

>>EIP; c014e405 <open_namei+491/658> <=====
Trace; c013e0b7 <filp_open+3b/5c>
Trace; c0241365 <stext_lock+6635/8efc>
Trace; c013e413 <sys_open+a7/250>
Trace; c010bf80 <system_call+34/38>
Trace; c0241365 <stext_lock+6635/8efc>
Trace; c0110018 <handle_vm86_fault+878/1090>
Trace; c0240018 <stext_lock+52e8/8efc>
Trace; c010719c <init+104/238>
Trace; c010907b <kernel_thread+23/30>
Code; c014e405 <open_namei+491/658>
00000000 <_EIP>:
Code; c014e405 <open_namei+491/658> <=====
   0: 0f b7 46 22 movzwl 0x22(%esi),%eax <=====
Code; c014e409 <open_namei+495/658>
   4: 25 00 f0 ff ff andl $0xfffff000,%eax
Code; c014e40e <open_namei+49a/658>
   9: 66 3d 00 a0 cmpw $0xa000,%ax
Code; c014e412 <open_namei+49e/658>
   d: 0f 84 98 01 00 00 je 1ab <_EIP+0x1ab> c014e5b0 <open_namei+63c/658>
Code; c014e418 <open_namei+4a4/658>
  13: bb 00 00 00 00 movl $0x0,%ebx
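
As a quick sanity check on the decode above, the sketch below redoes the
arithmetic using only values printed in the oops. It confirms that the
faulting address is esi plus the 0x22 displacement of the movzwl, and shows
what esi looks like when read as four little-endian ASCII bytes. The helper
names are hypothetical. Reading the following andl/cmpw pair as an
S_ISLNK()-style mode test is an assumption based on the standard stat
constants, not something the trace states, and the ASCII reading of esi is
only suggestive of a pointer overwritten with string data.

```python
import stat

# Values copied from the oops above.
ESI = 0x656C6F73         # esi at the time of the fault
FAULT_ADDR = 0x656C6F95  # "Unable to handle kernel paging request at ..."
DISPLACEMENT = 0x22      # from: movzwl 0x22(%esi),%eax


def effective_address(base: int, disp: int) -> int:
    """Address touched by a base+displacement operand on 32-bit x86."""
    return (base + disp) & 0xFFFFFFFF


def as_ascii(value: int) -> str:
    """Interpret a 32-bit register value as four little-endian bytes of text."""
    return value.to_bytes(4, "little").decode("ascii", errors="replace")


def looks_like_symlink_test(mask: int, compare: int) -> bool:
    """True if (x & mask) == compare has the shape of an S_ISLNK() check.

    Assumption: the masked value is a 16-bit file mode, so only the low
    16 bits of the mask matter.
    """
    return (mask & 0xFFFF) == stat.S_IFMT(0xFFFF) and compare == stat.S_IFLNK


if __name__ == "__main__":
    print(hex(effective_address(ESI, DISPLACEMENT)))
    print(repr(as_ascii(ESI)))
    print(looks_like_symlink_test(0xFFFFF000, 0xA000))
```

If the three checks hold, the oops is consistent with open_namei reading a
mode field through a pointer that is not a valid kernel address at all.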

NMI Watchdog detected LOCKUP on CPU0, registers:
CPU: 0
EIP: 0010:[<c0123875>]
EFLAGS: 00000086
eax: c1270000 ebx: c12c8000 ecx: c02bc0a8 edx: c02bc0a8
esi: 00000001 edi: c1270000 ebp: c1270000 esp: c1271de4
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 1, stackpage=c1271000)
Stack: c12704fc c12704dc 00000282 c1271e18 c0123ccd c0123d84 00000000 656c6f73
       656c6f95 c1271ee4 00000001 c12704dc c1270000 c1271ee4 c010c4d5 0000000b
       00000000 c01167be c0249e7e c1271ef4 00000000 c1270000 656c6f73 c13f9000
Call Trace: [<c0123ccd>] [<c0123d84>] [<c010c4d5>] [<c01167be>] [<c0249e7e>] [<c0161ecc>] [<c0162154>]
            [<c01f7e64>] [<c0197e2b>] [<c010c0a9>] [<c014e405>] [<c013e0b7>] [<c0241365>] [<c013e413>]
            [<c0241365>] [<c0110018>] [<c0240018>] [<c010719c>] [<c010907b>]
Code: 8b 03 83 f8 04 75 0d 8b 43 58 50 53 e8 82 41 ff ff 83 c4 08

>>EIP; c0123875 <exit_notify+231/364> <=====
Trace; c0123ccd <do_exit+325/46c>
Trace; c0123d84 <do_exit+3dc/46c>
Trace; c010c4d5 <die+d5/d8>
Trace; c01167be <do_page_fault+416/528>
Trace; c0249e7e <call_spurious_interrupt+58a5/92db>
Trace; c0161ecc <ext2_read_inode+fc/404>
Trace; c0162154 <ext2_read_inode+384/404>
Trace; c01f7e64 <vgacon_scroll+1e8/210>
Trace; c0197e2b <scrup+6b/108>
Trace; c010c0a9 <error_code+2d/38>
Trace; c014e405 <open_namei+491/658>
Trace; c013e0b7 <filp_open+3b/5c>
Trace; c0241365 <stext_lock+6635/8efc>
Trace; c013e413 <sys_open+a7/250>
Trace; c0241365 <stext_lock+6635/8efc>
Trace; c0110018 <handle_vm86_fault+878/1090>
Trace; c0240018 <stext_lock+52e8/8efc>
Trace; c010719c <init+104/238>
Trace; c010907b <kernel_thread+23/30>
Code; c0123875 <exit_notify+231/364>
00000000 <_EIP>:
Code; c0123875 <exit_notify+231/364> <=====
   0: 8b 03 movl (%ebx),%eax <=====
Code; c0123877 <exit_notify+233/364>
   2: 83 f8 04 cmpl $0x4,%eax
Code; c012387a <exit_notify+236/364>
   5: 75 0d jne 14 <_EIP+0x14> c0123889 <exit_notify+245/364>
Code; c012387c <exit_notify+238/364>
   7: 8b 43 58 movl 0x58(%ebx),%eax
Code; c012387f <exit_notify+23b/364>
   a: 50 pushl %eax
Code; c0123880 <exit_notify+23c/364>
   b: 53 pushl %ebx
Code; c0123881 <exit_notify+23d/364>
   c: e8 82 41 ff ff call ffff4193 <_EIP+0xffff4193> c0117a08 <notify_parent+0/d0>
Code; c0123886 <exit_notify+242/364>
  11: 83 c4 08 addl $0x8,%esp
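
The branch targets ksymoops prints can be cross-checked by hand: an x86
relative jump or call lands at the address of the next instruction plus the
sign-extended displacement encoded in the instruction bytes. A minimal
sketch, using a hypothetical helper name and only addresses and byte values
taken from the two decodes above:

```python
def rel_target(insn_addr: int, insn_len: int, rel: int, rel_bits: int) -> int:
    """Target of an x86 relative jump/call: next instruction + signed offset."""
    sign_bit = 1 << (rel_bits - 1)
    signed = (rel ^ sign_bit) - sign_bit  # sign-extend the raw displacement
    return (insn_addr + insn_len + signed) & 0xFFFFFFFF


if __name__ == "__main__":
    # call at c0123881: e8 82 41 ff ff (5 bytes, rel32 = 0xffff4182)
    print(hex(rel_target(0xC0123881, 5, 0xFFFF4182, 32)))
    # jne at c012387a: 75 0d (2 bytes, rel8 = 0x0d)
    print(hex(rel_target(0xC012387A, 2, 0x0D, 8)))
    # je at c014e412: 0f 84 98 01 00 00 (6 bytes, rel32 = 0x198)
    print(hex(rel_target(0xC014E412, 6, 0x00000198, 32)))
```

The results can be compared against the c0117a08, c0123889, and c014e5b0
targets shown in the decoded output, which is a cheap way to confirm that
the System.map and vmlinux handed to ksymoops match the kernel that crashed.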

And the .config (module details left out, since the system never boots far
enough to reach them):

CONFIG_X86=y
CONFIG_ISA=y
CONFIG_UID16=y
CONFIG_EXPERIMENTAL=y
CONFIG_M686=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_PGE=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_NOHIGHMEM=y
CONFIG_MTRR=y
CONFIG_SMP=y
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y
CONFIG_NET=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
CONFIG_SYSVIPC=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y
CONFIG_PARPORT=y
CONFIG_PARPORT_PC=y
CONFIG_PARPORT_PC_FIFO=y
CONFIG_PARPORT_1284=y
CONFIG_PNP=y
CONFIG_ISAPNP=y
CONFIG_BLK_DEV_FD=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_NETLINK=y
CONFIG_NETLINK_DEV=y
CONFIG_NETFILTER=y
CONFIG_FILTER=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_ALIAS=y
CONFIG_SKB_LARGE=y
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_IDEDMA_PCI_EXPERIMENTAL=y
CONFIG_SCSI=y
CONFIG_BLK_DEV_SR=y
CONFIG_CHR_DEV_SG=y
CONFIG_SCSI_DEBUG_QUEUES=y
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_TCQ_ON_BY_DEFAULT=y
CONFIG_NETDEVICES=y
CONFIG_NET_ETHERNET=y
CONFIG_NET_PCI=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_SERIAL=y
CONFIG_UNIX98_PTYS=y
CONFIG_PRINTER=y
CONFIG_PSMOUSE=y
CONFIG_NVRAM=y
CONFIG_RTC=y
CONFIG_QUOTA=y
CONFIG_AUTOFS4_FS=y
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_PROC_FS=y
CONFIG_DEVPTS_FS=y
CONFIG_EXT2_FS=y
CONFIG_NFS_V3=y
CONFIG_NFSD_V3=y
CONFIG_LOCKD_V4=y
CONFIG_PARTITION_ADVANCED=y
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_VGA_CONSOLE=y
CONFIG_VIDEO_SELECT=y
CONFIG_SOUND_TRACEINIT=y
CONFIG_SOUND_DMAP=y
CONFIG_MAGIC_SYSRQ=y

and the raw crash output, for the morbidly curious:

Freeing unused kernel memory: 180k freed
Unable to handle kernel paging request at virtual address 656c6f95
 printing eip:
c014e405
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c014e405>]
EFLAGS: 000010212
eax: 00000000 ebx: ffffffd8 ecx: c13f9000 edx: c13f9000
esi: 656c6f73 edi: c13f9000 ebp: 00000000 esp: c1271f28
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 1, stackpage=c1271000)
Stack: 00000000 c13f9000 00000002 c1270000 00000000 c02a4d34 c13f9000 00000006
       c013e0b7 c13f9000 00000003 00000000 c1271f68 c1270000 c1270000 00000000
       c13f9000 c0241365 00000000 00000000 00000000 c013e413 c13f9000 00000000
Call Trace: [<c013e0b7>] [<c0241365>] [<c013e413>] [<c010bf80>] [<c0241365>] [<c0110018>] [<c0240018>]
            [<c010719c>] [<c010907b>]
Code: 0f b7 46 22 25 00 f0 ff ff 66 3d 00 a0 0f 84 98 01 00 00 bb
NMI Watchdog detected LOCKUP on CPU0, registers:
CPU: 0
EIP: 0010:[<c0123875>]
EFLAGS: 00000086
eax: c1270000 ebx: c12c8000 ecx: c02bc0a8 edx: c02bc0a8
esi: 00000001 edi: c1270000 ebp: c1270000 esp: c1271de4
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 1, stackpage=c1271000)
Stack: c12704fc c12704dc 00000282 c1271e18 c0123ccd c0123d84 00000000 656c6f73
       656c6f95 c1271ee4 00000001 c12704dc c1270000 c1271ee4 c010c4d5 0000000b
       00000000 c01167be c0249e7e c1271ef4 00000000 c1270000 656c6f73 c13f9000
Call Trace: [<c0123ccd>] [<c0123d84>] [<c010c4d5>] [<c01167be>] [<c0249e7e>] [<c0161ecc>] [<c0162154>]
            [<c01f7e64>] [<c0197e2b>] [<c010c0a9>] [<c014e405>] [<c013e0b7>] [<c0241365>] [<c013e413>]
            [<c0241365>] [<c0110018>] [<c0240018>] [<c010719c>] [<c010907b>]
Code: 8b 03 83 f8 04 75 0d 8b 43 58 50 53 e8 82 41 ff ff 83 c4 08
console shuts up ...

--------------------------------------------
 Words of wisdom to live by:
        Never play leapfrog with a unicorn
--------------------------------------------
