Kernel crash in Linux 2.2.12

Ram'on Garc'ia Fern'andez (ramon@jl1.quim.ucm.es)
Mon, 27 Sep 1999 13:51:47 +0200


Hello,

This is a bug report of a complete machine lock, with kernel 2.2.12.
The kernel is compiled for SMP. This is a two processor machine.

The lock usually happens under X, when running Netscape. Usually it
is imposible to get any log, because the machine is under X and kernel
messages are not seen in the screen. No crash has been observed under
lynx.

No log has been seen in /var/log/messages,
the place where syslogd is placing kernel messages in this machine
(Redhat 6.0).

However, once I was in a text
console during a Netscape download, and so I could see kernel panic
messages. There were lots of messages in the screen, so I could not
copy then. I could copy only the last one.

The tulip driver, the point of the crash was loaded as a module.
The map was reconstructed with insmod -m. The tulip driver is
usually loaded via kerneld, when the startup scripts set up the
network interface eth0. So I had to check that I loaded at the same
address where it was loaded when the crash happened. For this purpose,
I checked the tulip symbols in /proc/ksyms from a normal machine
startup, with the symbols from insmod -m.

Since the message was copied by hand, I could have made any mistake.
However, it seems to be consistent. Register values are consistent
with the instruction that crashed and with the address of crash.

Unable to handle NULL pointer reference at virtual address 0x94.

current->tss.cr3 = 0x289e000, %cr3 = [idem]

*pde = 0
Ooops: 0
CPU: 1
EIP: 0x10: 0xC4049b23 (tulip_rx + 515)
EFLAGS= 0x10206
Registers:
EAX = 0
EBX = 0x10
ECX = 0
EDX = 0xC0684b50
ESI = 0xD6
EDI = 0xC0684800
EBP = 0x0
ESP = 0xC3D35D98
DS = ES = SS = 0x18

Process netscape-communicator PID: 16835
Process nr: 37, stackpage = 0xC3DE5000

Stack: 4 0xF0670040 0 0xD6 0xC0604818 0 0xC0684810 0xD6 0 0x1F 0xC4049530
0xC3ED9920 0xC3CDEC20 0x400000 1 4 0xA 0x46 0xC3FD8640 0xC017C6F2 0xC3FD8640
0xC3DE5DF4 0x82 0xC3FD8448 0x82

Call trace: (offsets in decimal)
0xC4049530 (tulip_interrupt + 216)
0xC017C6F2 (start_next_request + 78)
0xC017C98C (ide_intr + 212)
0xC010F3BD (timer_interrupt + 149)
0xC010C2FD (handle_IRQ_event + 85)
0xC0110F0E (do_level_ioapic_IRQ + 98)
0xC010F842 (old_select + 86)
0xC0109FE4 (system_call + 52)

Code: 8B AB 84 00 00 00 8D 04 2E 89 83 84
00 00 00 01 73 5C 3B 83 84 00 00 00
01 73 5C 3B 83

Disassembly: (from tulip.o at tulip_rx + 515)
3ad7: 8b ab 84 00 00 00 movl 0x84(%ebx),%ebp
3add: 8d 04 2e leal (%esi,%ebp,1),%eax
3ae0: 89 83 84 00 00 00 movl %eax,0x84(%ebx)
3ae6: 01 73 5c addl %esi,0x5c(%ebx)
3ae9: 3b 83 88 00 00 00 cmpl 0x88(%ebx),%eax
3aef: 76 13 jbe 3b04 <tulip_rx+0x230>
3af1: 68 04 3b 00 00 pushl $0x3b04
3af6: 8b 4c 24 18 movl 0x18(%esp,1),%ecx
3afa: 51 pushl %ecx
3afb: 53 pushl %ebx
3afc: e8 fc ff ff ff call 3afd <tulip_rx+0x229>
3b01: 83 c4 0c addl $0xc,%esp

According to gdb, the source code is:

skb_put(skb, pkt_len) (tulip.c, line 2406)

actually an inline function. The faulting statement is:
unsigned char *tmp=skb->tail; (include/linux/skbuff.h:462)

Hope this is enough.

Best regards,
Ramon

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/