2.1.112: Another page fault from irq handler

Jean Wolter (jean.wolter@inf.tu-dresden.de)
Mon, 10 Aug 1998 18:06:12 +0200


Hello,

Today I found the following error message on the console of our
server (the server was no longer responding):

stuck on smp_invalidate_needed IPI wait (CPU#255)
page fault from irq handler: 0
CPU: 2
EIP: 0010:[<0>]
EFLAGS: 10046
eax: 00000000 ebx: c4122000 ecx: cfc8c000 edx: c021afb0
esi: c1cd0ff4 edi: c4122000 ebp: bfffd59c esp: c4123f58
ds: ??? es: ??? fs: ??? gs: ??? ss: ???
Process ftpd (pid: xxx, process nr: xxx, stackpage=xxx)
Stack: 00000018 00000018 ffffffff c01e9154 00000010 00000202 bffe5000 bfffd59c
c4122000 c48e3000 00000202 09ab6065 c011115f c4122000 c92a9fa0 bfffd59c
00000001 c4122000 00000003 0000000c bfffe568 c92a9fa0 c4122000 c4122000
Call Trace: [<c01e9154>] [<c011115f>] [<c0109f40>] [<c010002b>]
Code:

The call trace resolves as follows:

Using `/home/jw5/to_stream/System.map' to map addresses to symbols.

Trace: c01e9154 <stext_lock+b60/4655>
Trace: c011115f <do_page_fault+14b/33c>
Trace: c0109f40 <error_code+30/38>
Trace: c010002b <startup_32+2b/a4>
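
For anyone who wants to repeat the lookup without the original script, it can be sketched in a few lines of Python. This is only an illustration: the symbol start addresses are not copied from System.map but derived from the trace itself (address minus the printed offset), and the name `resolve` is made up for this example.

```python
import bisect

# Symbol start addresses derived from the trace above (address - offset);
# a real System.map would supply these directly.
SYMBOLS = sorted([
    (0xc0100000, "startup_32"),
    (0xc0109f10, "error_code"),
    (0xc0111014, "do_page_fault"),
    (0xc01e85f4, "stext_lock"),
])
STARTS = [addr for addr, _ in SYMBOLS]

def resolve(addr):
    """Return 'symbol+offset' for the nearest symbol at or below addr."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None  # address lies below every known symbol
    start, name = SYMBOLS[i]
    return "%s+%x" % (name, addr - start)

for a in (0xc01e9154, 0xc011115f, 0xc0109f40, 0xc010002b):
    print("%08x <%s>" % (a, resolve(a)))
```

With this table an address of 0, like the reported EIP, lies below every symbol and resolves to nothing.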

The last known address (c01e9154) is in lock code reached from handle_mm_fault:

c01e914d: f6 05 24 79 21 testb $0x1,0xc0217924
c01e9152: c0 01
c01e9154: 75 f7 jne c01e914d <__stop_fixup+0xb59>
c01e9156: e9 86 4c f3 ff jmp c011dde1 <handle_mm_fault+0xe9>

***

c011dddd: 85 c0 testl %eax,%eax
c011dddf: 75 0f jne c011ddf0 <handle_mm_fault+0xf8>
c011dde1: f0 0f ba 2d 24 lock btsl $0x0,0xc0217924
c011dde6: 79 21 c0 00
c011ddea: 0f 82 5d b3 0c jb c01e914d <__stop_fixup+0xb59>
c011ddef: 00
c011ddf0: 8b 1e movl (%esi),%ebx

/home/jw5/tmp/linux/linux-2.1.112/include/linux/smp_lock.h:54
1595: f0 0f ba 2d 00 lock btsl $0x0,0x0
159a: 00 00 00 00
159e: 0f 82 fc ff ff jb 15a0 <handle_mm_fault+0xf4>
15a3: ff
/home/jw5/tmp/linux/linux-2.1.112/mm/memory.c:889
15a4: 8b 1e movl (%esi),%ebx
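
The listings can be tied together by decoding the relative branches by hand. A minimal Python sketch, assuming the usual i386 rule that a relative branch target is the address of the next instruction plus the sign-extended displacement; the helper name `branch_target` is hypothetical:

```python
def branch_target(addr, length, disp, disp_bits):
    """Target of an i386 relative branch located at addr.

    length is the instruction size in bytes; disp is the raw
    little-endian displacement, disp_bits wide (8 or 32).
    """
    if disp >= 1 << (disp_bits - 1):  # sign-extend the displacement
        disp -= 1 << disp_bits
    return (addr + length + disp) & 0xffffffff

# jb at c011ddea (0f 82 5d b3 0c 00): taken when the lock bit was already set
print("%08x" % branch_target(0xc011ddea, 6, 0x000cb35d, 32))
# jne at c01e9154 (75 f7): back to the testb while the bit is still set
print("%08x" % branch_target(0xc01e9154, 2, 0xf7, 8))
# jmp at c01e9156 (e9 86 4c f3 ff): retry the lock btsl
print("%08x" % branch_target(0xc01e9156, 5, 0xfff34c86, 32))
```

The first two branches land on c01e914d and the third on c011dde1, so the trace address c01e9154 sits in a loop that polls bit 0 of 0xc0217924 and returns to the lock btsl at handle_mm_fault+0xe9 once the bit clears.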

The strange thing is that EIP == 0. It looks as if some function
returned to a return address of 0.

And why is the kernel complaining about a page fault within an
interrupt? There is no trace of any interrupt (at least not on this
CPU). Or am I missing something?

The machine in question is an AMI Goliath equipped with four PPro CPUs
and 256 MB of memory.
The kernel is a stock 2.1.112 (with the lock_kernel patch applied).

/proc/interrupts reports:
3: IO-APIC-edge NE2000
7: IO-APIC-edge Digital DS21140 Tulip
9: IO-APIC-edge ncr53c8xx
10: IO-APIC-edge ncr53c8xx

If I can provide any additional information, feel free to ask.

Jean

PS: the config file:
grep '^C' .config
CONFIG_M686=y
CONFIG_MODULES=y
CONFIG_NET=y
CONFIG_PCI=y
CONFIG_PCI_BIOS=y
CONFIG_SYSVIPC=y
CONFIG_SYSCTL=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y
CONFIG_BLK_DEV_FD=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_LINEAR=y
CONFIG_MD_STRIPED=y
CONFIG_MD_BOOT=y
CONFIG_BLK_DEV_RAM=y
CONFIG_PARIDE_PARPORT=y
CONFIG_PACKET=y
CONFIG_NET_ALIAS=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_ALIAS=y
CONFIG_SYN_COOKIES=y
CONFIG_INET_RARP=y
CONFIG_IP_NOSR=y
CONFIG_SKB_LARGE=y
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=y
CONFIG_BLK_DEV_SR=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_BUSLOGIC=y
CONFIG_SCSI_OMIT_FLASHPOINT=y
CONFIG_SCSI_NCR53C8XX=y
CONFIG_SCSI_NCR53C8XX_DEFAULT_TAGS=8
CONFIG_SCSI_NCR53C8XX_MAX_TAGS=4
CONFIG_SCSI_NCR53C8XX_SYNC=5
CONFIG_NETDEVICES=y
CONFIG_DUMMY=y
CONFIG_NET_ETHERNET=y
CONFIG_NET_VENDOR_3COM=y
CONFIG_VORTEX=y
CONFIG_NET_ISA=y
CONFIG_NE2000=y
CONFIG_NET_EISA=y
CONFIG_DE4X5=y
CONFIG_DEC_ELCP=y
CONFIG_MINIX_FS=y
CONFIG_EXT2_FS=y
CONFIG_ISO9660_FS=y
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_PROC_FS=y
CONFIG_NFS_FS=y
CONFIG_SUNRPC=y
CONFIG_LOCKD=y
CONFIG_SMB_FS=y
CONFIG_AUTOFS_FS=y
CONFIG_NLS=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_SERIAL=y
CONFIG_VGA_CONSOLE=y
CONFIG_MAGIC_SYSRQ=y
