a couple of Oopses from 2.1.42 + Mark Hemment's latest slab patch

gt1355b@prism.gatech.edu
Thu, 5 Jun 1997 00:09:43 +0000 (GMT)


My personal home system is a uniprocessor (yes, I comment out the SMP
line) running stock 2.1.42 + the latest slab patch from Mark Hemment.
W/o the slab patch, the box falls over (with no error logged or on any
screen) with heavy network traffic. With the patch, I've been able to
do some really heavy network stuff, high load, etc. without the box
crashing.

This morning, I came back from class and found the following in the log:

Scheduling in interrupt
Unable to handle kernel NULL pointer dereference at virtual address 00000000
current->tss.cr3 = 01160000, %cr3 = 01160000
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<c010ef2e>]
EFLAGS: 00010286
eax: 00000018 ebx: c10b6000 ecx: c0197844 edx: c1bda000
esi: c0231c54 edi: c10b7db8 ebp: c10b7da4 esp: c10b7d7c
ds: 0018 es: 0018 ss: 0018
Process asmail (pid: 9877, process nr: 32, stackpage=c10b7000)
Stack: c01836c0 00000000 c0231c54 c10b7db8 c0231c54 c10b7db8 c016623b c00980e0
00000286 00000000 00000001 c011adbc c1929000 00085d00 c0231c54 c10b6000
c0231c70 c011f59f c0231c54 c1929000 00085d00 00000000 404b0000 0000003c
Call Trace: [<c01836c0>] [<c016623b>] [<c011adbc>] [<c011f59f>] [<c280a324>] [<c014554f>] [<c0136008>]
[<c011f120>] [<c011fbb9>] [<c011dd18>] [<c011e3a2>] [<c0145588>] [<c0145a8f>] [<c28069b3>] [<c280a324>]
[<c280a324>] [<c28064bf>] [<c280a324>] [<c010a28f>] [<c01094fc>]
Code: c7 05 00 00 00 00 00 00 00 00 8d 65 dc 5b 5e 5f 89 ec 5d c3
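
For anyone not used to reading the `Oops: 0002` line: on i386 this is the page-fault error code supplied by the CPU, printed in hex. The sketch below decodes its three low bits. The function name `decode_oops_code` is made up for illustration and is not part of any kernel tool.

```python
def decode_oops_code(code_text):
    """Decode the low three bits of an i386 page-fault error code.

    code_text is the hex string printed after "Oops:", e.g. "0002".
    """
    code = int(code_text, 16)
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 2 else "read",
        # bit 2: 0 = kernel (supervisor) mode, 1 = user mode
        "mode": "user" if code & 4 else "kernel",
    }


print(decode_oops_code("0002"))
```

For `0002` this gives a kernel-mode write to a page that is not present, which matches the reported NULL pointer write at virtual address 00000000.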

Disassembled, that's:

Using `/build/kernel/linux-2.1.42/System.map' to map addresses to symbols.

>>EIP: c010ef2e <schedule+1ee/204>
Trace: c01836c0 <tvecs+14/3f3a>
Trace: c016623b <do_ide0_request+b/10>
Trace: c011adbc <__wait_on_page+7c/b8>
Trace: c011f59f <rw_swap_page+19f/2e0>
Trace: c280a324
Trace: c014554f <alloc_skb+2b/148>
Trace: c0136008 <shm_swap+2bc/2e8>
Trace: c011f120 <try_to_free_page+7c/b4>
Trace: c011fbb9 <__get_free_pages+1ad/200>
Trace: c011dd18 <kmem_cache_grow+100/3d8>
Trace: c011e3a2 <kmalloc+ea/154>
Trace: c0145588 <alloc_skb+64/148>
Trace: c0145a8f <dev_alloc_skb+f/28>
Trace: c28069b3
Trace: c280a324
Trace: c280a324
Trace: c28064bf
Trace: c280a324
Trace: c010a28f <do_IRQ+73/e8>
Trace: c01094fc <ret_from_intr>
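
The address-to-symbol step can be sketched roughly as follows, assuming the usual `address type name` line format of System.map. The table is a hypothetical three-entry excerpt: the start address of `schedule` is derived from the trace itself (c010ef2e minus 0x1ee), while `before_schedule` and `after_schedule` are placeholder names, not real 2.1.42 symbols.

```python
import bisect


def load_symbols(lines):
    """Parse 'address type name' lines into a list sorted by address."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return 'name+offset/size' for addr, or None if it is outside the table."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    # No symbol at or below addr, or addr lies past the last known symbol.
    if i < 0 or i + 1 >= len(symbols):
        return None
    start, name = symbols[i]
    size = symbols[i + 1][0] - start
    return "%s+%x/%x" % (name, addr - start, size)


# Hypothetical excerpt; only the schedule address is derived from the trace.
SYSTEM_MAP = [
    "c010ec00 T before_schedule",
    "c010ed40 T schedule",
    "c010ef44 T after_schedule",
]

symbols = load_symbols(SYSTEM_MAP)
print(resolve(symbols, 0xC010EF2E))  # schedule+1ee/204
print(resolve(symbols, 0xC280A324))  # None
```

Addresses such as c280a324 fall outside the table and stay unresolved, presumably because they lie in module space, which System.map does not cover. That is an assumption; the trace itself does not say so.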

Code: c010ef2e <schedule+1ee/204> movl $0x0,0x0
Code: c010ef38 <schedule+1f8/204> leal 0xffffffdc(%ebp),%esp
Code: c010ef3b <schedule+1fb/204> popl %ebx
Code: c010ef3c <schedule+1fc/204> popl %esi
Code: c010ef3d <schedule+1fd/204> popl %edi
Code: c010ef3e <schedule+1fe/204> movl %ebp,%esp
Code: c010ef40 <schedule+200/204> popl %ebp
Code: c010ef41 <schedule+201/204> ret
Code: c010ef42 <schedule+202/204>
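
To check the first disassembled instruction by hand, the sketch below decodes the `c7 05` form from the `Code:` line. It handles only this one encoding (opcode C7 /0 with a 32-bit absolute address, i.e. `movl $imm32,disp32`) and is not a general disassembler. `decode_movl_abs` is a hypothetical helper name.

```python
import struct


def decode_movl_abs(code_text):
    """Decode 'movl $imm32,disp32' (bytes C7 05 disp32 imm32), else return None."""
    raw = bytes(int(b, 16) for b in code_text.split())
    if len(raw) < 10 or raw[0] != 0xC7 or raw[1] != 0x05:
        return None
    # Both fields are little-endian; the address precedes the immediate.
    address, value = struct.unpack_from("<II", raw, 2)
    return "movl $0x%x,0x%x" % (value, address)


CODE = "c7 05 00 00 00 00 00 00 00 00 8d 65 dc 5b 5e 5f 89 ec 5d c3"
print(decode_movl_abs(CODE))  # movl $0x0,0x0
```

The instruction stores 0 to absolute address 0, so the fault looks like a deliberate NULL write inside schedule() rather than a stray pointer. It appears to follow directly from the "Scheduling in interrupt" message, but that is an interpretation of the log, not something confirmed here.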

The box was completely off the network (TX error timeouts and Socket
Destroy delayed messages scrolling off the screen w/ any attempted
traffic). At the time of the crash, the box was idle except for an
inactive X session. Before I could do much, the box locked totally
w/ no further messages.

I just got home for the night to find this oops:

kmem_cache_reap() called within int!
Scheduling in interrupt
Unable to handle kernel NULL pointer dereference at virtual address 00000000
current->tss.cr3 = 00a4d000, %cr3 = 00a4d000
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<c010ef2e>]
EFLAGS: 00010286
eax: 00000018 ebx: c0af8000 ecx: c0197844 edx: c1d50000
esi: c021ede0 edi: c0af9d68 ebp: c0af9d54 esp: c0af9d2c
ds: 0018 es: 0018 ss: 0018
Process netscape (pid: 3721, process nr: 34, stackpage=c0af9000)
Stack: c01836c0 00000000 c021ede0 c0af9d68 c021ede0 c0af9d68 c016623b c00980e0
00000286 00000000 00000001 c011adbc 00086600 00086600 c021ede0 c0af8000
c021edfc c011f59f c021ede0 00086600 c021ede0 00000001 c1358000 c014feaa
Call Trace: [<c01836c0>] [<c016623b>] [<c011adbc>] [<c011f59f>] [<c014feaa>] [<c01500e5>] [<c011ed3d>]
[<c011ef1a>] [<c011f046>] [<c011f139>] [<c011fbb9>] [<c011dd18>] [<c011e3a2>] [<c0145588>] [<c0145a8f>]
[<c28069b3>] [<c280a324>] [<c012c4b4>] [<c28064bf>] [<c280a324>] [<c010a28f>] [<c01094fc>]
Code: c7 05 00 00 00 00 00 00 00 00 8d 65 dc 5b 5e 5f 89 ec 5d c3
Aiee, killing interrupt handler

Disassembled, that's:

Using `/build/kernel/linux-2.1.42/System.map' to map addresses to symbols.

>>EIP: c010ef2e <schedule+1ee/204>
Trace: c01836c0 <tvecs+14/3f3a>
Trace: c016623b <do_ide0_request+b/10>
Trace: c011adbc <__wait_on_page+7c/b8>
Trace: c011f59f <rw_swap_page+19f/2e0>
Trace: c014feaa <ip_output+4a/7c>
Trace: c01500e5 <ip_queue_xmit+199/20c>
Trace: c011ed3d <swap_out_vma+2f1/440>
Trace: c011ef1a <swap_out_process+8e/b8>
Trace: c011f046 <swap_out+102/160>
Trace: c011f139 <try_to_free_page+95/b4>
Trace: c011fbb9 <__get_free_pages+1ad/200>
Trace: c011dd18 <kmem_cache_grow+100/3d8>
Trace: c011e3a2 <kmalloc+ea/154>
Trace: c0145588 <alloc_skb+64/148>
Trace: c0145a8f <dev_alloc_skb+f/28>
Trace: c28069b3
Trace: c280a324
Trace: c012c4b4 <sys_select+314/324>
Trace: c28064bf
Trace: c280a324
Trace: c010a28f <do_IRQ+73/e8>
Trace: c01094fc <ret_from_intr>

Code: c010ef2e <schedule+1ee/204> movl $0x0,0x0
Code: c010ef38 <schedule+1f8/204> leal 0xffffffdc(%ebp),%esp
Code: c010ef3b <schedule+1fb/204> popl %ebx
Code: c010ef3c <schedule+1fc/204> popl %esi
Code: c010ef3d <schedule+1fd/204> popl %edi
Code: c010ef3e <schedule+1fe/204> movl %ebp,%esp
Code: c010ef40 <schedule+200/204> popl %ebp
Code: c010ef41 <schedule+201/204> ret
Code: c010ef42 <schedule+202/204>
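
Since both oopses die at the same EIP, it may help to see which resolved frames the two traces share. The sketch below does that over the symbol names copied from the two listings (unresolved addresses are left out). `common_frames` is a hypothetical helper, and the comparison is by name only, ignoring offsets.

```python
def common_frames(first, second):
    """Return symbols of `first` that also occur in `second`, in order, no repeats."""
    seen = set(second)
    result = []
    for name in first:
        if name in seen and name not in result:
            result.append(name)
    return result


# Resolved symbols from the first oops, in the order listed.
FIRST = [
    "tvecs", "do_ide0_request", "__wait_on_page", "rw_swap_page",
    "alloc_skb", "shm_swap", "try_to_free_page", "__get_free_pages",
    "kmem_cache_grow", "kmalloc", "alloc_skb", "dev_alloc_skb",
    "do_IRQ", "ret_from_intr",
]

# Resolved symbols from the second oops, in the order listed.
SECOND = [
    "tvecs", "do_ide0_request", "__wait_on_page", "rw_swap_page",
    "ip_output", "ip_queue_xmit", "swap_out_vma", "swap_out_process",
    "swap_out", "try_to_free_page", "__get_free_pages", "kmem_cache_grow",
    "kmalloc", "alloc_skb", "dev_alloc_skb", "sys_select",
    "do_IRQ", "ret_from_intr",
]

for name in common_frames(FIRST, SECOND):
    print(name)
```

Read from the bottom up, the shared path is an interrupt (do_IRQ) allocating an skb through dev_alloc_skb and kmalloc, the slab cache growing via __get_free_pages, and page reclaim ending in rw_swap_page and __wait_on_page before schedule is reached. That would fit both the "Scheduling in interrupt" and "kmem_cache_reap() called within int!" messages, though it is only a reading of the traces. Some entries may also be stale values left on the stack rather than live frames.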

Again, the box was essentially idle at the time (I was logged on, but
inactive, in X with Netscape up). Again, the networking was down, with
similar errors (transmit timeouts, socket destroys delayed), and the
machine hung totally almost immediately after starting anything
interactive.

Let me know if you want more info.

thanks,
chris

--
Chris Ricker                                 gt1355b@prism.gatech.edu