Re: More on 2.1.129 oops

Richard Gooch (rgooch@atnf.csiro.au)
Mon, 23 Nov 1998 11:42:45 +1100


Philipp Rumpf writes:
> On Mon, Nov 23, 1998 at 10:17:38AM +1100, Richard Gooch wrote:
> > Hi, Philipp. Your analysis looks spot on.
> >
> > Obviously this patch can't go into Linus' kernel since it is x86
> > specific, but it demonstrates the point.
>
> > Linus: do you prefer a solution like the one I've appended, or an
> > optional semaphore? Personally, I like the semaphore idea, as it's
> > more flexible.
>
> We had to decrement the semaphore (actually, it was a usecount for
> this region of code) at the beginning of bdflush and kflushd, which
> would not be very flexible.

What are you talking about? Do you mean that you've made some other
changes?

> I think some wrapper function in the non-init sections would be the
> thing to do, as it could be architecture-independent and would be
> useful since it would allow the functions now called by
> kernel_thread to return (look into drivers/scsi/host.c for an
> example).

Fair enough.
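
A rough user-space sketch of the wrapper idea, not the actual kernel
patch: the thread starts in a small trampoline that lives in ordinary
(non-init) text, calls the real function, and deals with its return.
The names thread_start, thread_trampoline, finish_thread and
example_init_fn are hypothetical; in the kernel the last step would
presumably be an exit call rather than recording a value.

```c
#include <stddef.h>

/* Signature of the functions handed to kernel_thread() in this thread. */
typedef int (*thread_fn)(void *);

/* Hypothetical bundle: the real entry point plus its argument. */
struct thread_start {
    thread_fn fn;
    void *arg;
};

/* Stand-in for "exit this thread"; here it only records the status. */
static int last_exit_status = -1;

static void finish_thread(int status)
{
    last_exit_status = status;
}

/*
 * Hypothetical trampoline. Because the trampoline itself is not in an
 * init section, the wrapped function returns into code that stays
 * mapped, and the trampoline then ends the thread in a controlled way.
 */
static int thread_trampoline(void *data)
{
    struct thread_start *start = data;
    int status = start->fn(start->arg);

    finish_thread(status);
    return status;
}

/* Example payload standing in for an init-section function that returns. */
static int example_init_fn(void *arg)
{
    return arg != NULL ? 7 : 0;
}
```

The caller would pass thread_trampoline and a filled-in thread_start
where it now passes the init function directly, so nothing here is
architecture-specific.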

> > - pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
> > + pid = kernel_thread2(do_linuxrc, "/linuxrc", SIGCHLD);
>
> This is not necessary, as we wait for the thread to finish anyway.
>
> > if (pid>0)
> > while (pid != wait(&i));

Yeah, I didn't look that closely.

Sigh. After an hour of uptime, I noticed more oopses on this
machine. Kernel messages and ksymoops appended. They occur in the
route process which is started up from cron.

Regards,

Richard....

Linux version 2.1.129 (rgooch@workaholix) (gcc version 2.7.2.f.1) #15 Mon Nov 23 10:07:47 EST 1998
Console: colour VGA+ 80x60
Calibrating delay loop... 99.94 BogoMIPS
Memory: 28440k/32768k available (728k kernel code, 408k reserved, 600k data, 48k init)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
CPU: Cyrix 6x86 2x Core/Bus Clock stepping 2.5
Checking 386/387 coupling... OK, FPU using exception 16 error reporting.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
PCI: PCI BIOS revision 2.10 entry at 0xfb260
PCI: Probing PCI hardware
Swansea University Computer Society NET3.039 for Linux 2.1
NET3: Unix domain sockets 0.16 for Linux NET3.038.
Swansea University Computer Society TCP/IP for NET3.037
IP Protocols: ICMP, UDP, TCP
Starting kswapd v 1.5
Serial driver version 4.26 with SHARE_IRQ enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
Real Time Clock Driver v1.09
RAM disk driver initialized: 16 RAM disks of 8192K size
scsi : 0 hosts.
scsi : detected total.
RAMDISK: Compressed image found at block 0
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 48k freed
ne.c:v1.10 9/23/94 Donald Becker (becker@cesdis.gsfc.nasa.gov)
NE*000 ethercard probe at 0x280: 00 40 33 22 95 14
eth0: NE2000 found at 0x280, using IRQ 10.
Unable to handle kernel paging request at virtual address 00100184
current->tss.cr3 = 00e0d000, %cr3 = 00e0d000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<00100184>]
EFLAGS: 00010286
eax: 00ff0000 ebx: c0dae000 ecx: 00100184 edx: 00000018
esi: c0daffb0 edi: 40007000 ebp: bffffe9c esp: c0dafef4
ds: 0018 es: 0018 ss: 0018
Process route (pid: 84, process nr: 16, stackpage=c0daf000)
Stack: 00ff0000 c0dae000 00100084 00000018 00da00b0 c000cb00 ffff009c 8000cb00
c0000018 00000018 ffffffff 00100084 00000010 00010286 c0107cad c0daff38
c0000000 c0dae000 00100784 00000018 00da00b0 00000000 c0ffff9c 40ff7000
Call Trace: [<c0107cad>] [<c0107cad>] [<c0107cad>]
Code: <1>Unable to handle kernel paging request at virtual address 00100184
current->tss.cr3 = 00e0d000, %cr3 = 00e0d000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0108077>]
EFLAGS: 00010086
eax: 00100184 ebx: 00000000 ecx: 00000009 edx: c0f26000
esi: 0000022b edi: c0db0000 ebp: c2800000 esp: c0dafe30
ds: 0018 es: 0018 ss: 0018
Process route (pid: 84, process nr: 16, stackpage=c0daf000)
Stack: c0dafeb8 c0d846e0 c01d8ce6 c0daffb0 40007000 bffffe9c 00ff0000 c0dae000
00100184 00000018 00100184 00010286 c2800000 c3000000 c01080d8 c0dafeb8
c019f83c c01a075c 00000000 00000000 c010ce92 c01a075c c0dafeb8 00000000
Call Trace: [<c2800000>] [<c3000000>] [<c01080d8>] [<c019f83c>] [<c01a075c>] [<c010ce92>] [<c01a075c>]
[<c0107cad>] [<c0107cad>] [<c0107cad>] [<c0107cad>] [<c0107cad>]
Code: 0f b6 0c 03 89 4c 24 38 51 68 34 f8 19 c0 e8 42 93 00 00 83
Unable to handle kernel paging request at virtual address e0e02120
current->tss.cr3 = 00101000, %cr3 = 00101000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0117a0c>]
EFLAGS: 00010206
eax: 0004e000 ebx: 00000006 ecx: c0e0d080 edx: 00048000
esi: e0e02120 edi: 00048000 ebp: 00000000 esp: c0dafd28
ds: 0018 es: 0018 ss: 0018
Process route (pid: 84, process nr: 16, stackpage=c0daf000)
Stack: 00006000 c0d846e0 c0e0d080 c016a38a 00000000 0004e000 00000000 0804e000
c0e0d080 c011931e c0d846e0 08048000 00006000 c0d846e0 c0dae000 c0dafdf4
c0d846e0 c0f283a0 c011061a c0d846e0 c0d846e0 c0d846e0 c0114c45 c0d846e0
Call Trace: [<c016a38a>] [<c011931e>] [<c011061a>] [<c0114c45>] [<c01080db>] [<c01080e0>] [<c019f83c>]
[<c01a075c>] [<c010ce92>] [<c01a075c>] [<c2800000>] [<c0107cad>] [<c2800000>] [<c0108077>] [<c0100010>]
[<c2800000>] [<c3000000>] [<c01080d8>] [<c019f83c>] [<c01a075c>] [<c010ce92>] [<c01a075c>] [<c0107cad>]
[<c0107cad>] [<c0107cad>] [<c0107cad>] [<c0107cad>]
Code: 8b 06 83 c6 04 4b 85 c0 74 f2 c7 46 fc 00 00 00 00 a8 03 74
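
For reading the "Oops: 0000" lines above: on x86 the value is the page
fault error code, where bit 0 distinguishes a protection violation from
a not-present page, bit 1 a write from a read, and bit 2 user mode from
kernel mode. A small sketch of that decoding follows; the struct and
function names are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Decoded x86 page fault error code (low three bits only). */
struct fault_info {
    int protection; /* bit 0: 1 = protection violation, 0 = page not present */
    int write;      /* bit 1: 1 = write access, 0 = read access */
    int user;       /* bit 2: 1 = fault in user mode, 0 = kernel mode */
};

static struct fault_info decode_fault(unsigned long error_code)
{
    struct fault_info f;

    f.protection = (error_code & 0x1) != 0;
    f.write = (error_code & 0x2) != 0;
    f.user = (error_code & 0x4) != 0;
    return f;
}

/* Write a one-line description of the fault into buf. */
static void describe_fault(unsigned long error_code, char *buf, size_t len)
{
    struct fault_info f = decode_fault(error_code);

    snprintf(buf, len, "%s-mode %s, %s",
             f.user ? "user" : "kernel",
             f.write ? "write" : "read",
             f.protection ? "protection violation" : "page not present");
}
```

With an error code of 0000, as in all three oopses, this describes a
kernel-mode read of a page that is not present, which agrees with the
"*pde = 00000000" lines.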
===============================================================================
Using `../System.map' to map addresses to symbols.

>>EIP: 100184 cannot be resolved
Trace: c0107cad <error_code+2d/40>
Trace: c0107cad <error_code+2d/40>
Trace: c0107cad <error_code+2d/40>
Code:
>>EIP: c0108077 <show_registers+26b/29c>
Trace: c2800000
Trace: c3000000
Trace: c01080d8 <die+30/38>
Trace: c019f83c <stext_lock+e9c/1e58>
Trace: c01a075c <stext_lock+1dbc/1e58>
Trace: c010ce92 <do_page_fault+30e/364>
Trace: c01a075c <stext_lock+1dbc/1e58>
Trace: c0107cad <error_code+2d/40>
Trace: c0107cad <error_code+2d/40>
Trace: c0107cad <error_code+2d/40>
Trace: c0107cad <error_code+2d/40>
Trace: c0107cad <error_code+2d/40>
Code: c0108077 <show_registers+26b/29c>
Code: c0108077 <show_registers+26b/29c> 0f b6 0c 03 movzbl (%ebx,%eax,1),%ecx
Code: c010807b <show_registers+26f/29c> 89 4c 24 38 movl %ecx,0x38(%esp,1)
Code: c010807f <show_registers+273/29c> 51 pushl %ecx
Code: c0108080 <show_registers+274/29c> 68 34 f8 19 c0 pushl $0xc019f834
Code: c0108085 <show_registers+279/29c> e8 42 93 00 00 call c01113cc <printk>
Code: c010808a <show_registers+27e/29c> 83 00 90 addl $0xffffff90,(%eax)
Code: c010808d <show_registers+281/29c> 90 nop
Code: c010808e <show_registers+282/29c> 90 nop
>>EIP: c0117a0c <zap_page_range+f8/1bc>
Trace: c016a38a <unblank_screen+9e/a4>
Trace: c011931e <exit_mmap+aa/12c>
Trace: c011061a <mmput+1e/38>
Trace: c0114c45 <do_exit+b9/220>
Trace: c01080db <die+33/38>
Trace: c01080e0 <die_if_no_fixup>
Trace: c019f83c <stext_lock+e9c/1e58>
Trace: c01a075c <stext_lock+1dbc/1e58>
Trace: c010ce92 <do_page_fault+30e/364>
Trace: c01a075c <stext_lock+1dbc/1e58>
Trace: c2800000
Trace: c0107cad <error_code+2d/40>
Trace: c2800000
Trace: c0108077 <show_registers+26b/29c>
Trace: c0100010 <startup_32+10/11e>
Trace: c2800000
Trace: c3000000
Trace: c01080d8 <die+30/38>
Trace: c019f83c <stext_lock+e9c/1e58>
Trace: c01a075c <stext_lock+1dbc/1e58>
Trace: c010ce92 <do_page_fault+30e/364>
Trace: c01a075c <stext_lock+1dbc/1e58>
Trace: c0107cad <error_code+2d/40>
Trace: c0107cad <error_code+2d/40>
Trace: c0107cad <error_code+2d/40>
Trace: c0107cad <error_code+2d/40>
Trace: c0107cad <error_code+2d/40>
Code: c0117a0c <zap_page_range+f8/1bc>
Code: c0117a0c <zap_page_range+f8/1bc> 8b 06 movl (%esi),%eax
Code: c0117a0e <zap_page_range+fa/1bc> 83 c6 04 addl $0x4,%esi
Code: c0117a11 <zap_page_range+fd/1bc> 4b decl %ebx
Code: c0117a12 <zap_page_range+fe/1bc> 85 c0 testl %eax,%eax
Code: c0117a14 <zap_page_range+100/1bc> 74 f2 je fffffffc <_EIP+0xfffffffc>
Code: c0117a16 <zap_page_range+102/1bc> c7 46 fc 00 00 movl $0x0,0xfffffffc(%esi)
Code: c0117a1d <zap_page_range+109/1bc> a8 03 testb $0x3,%al
Code: c0117a1f <zap_page_range+10b/1bc> 74 00 je c0117a21 <zap_page_range+10d/1bc>
Code: c0117a21 <zap_page_range+10d/1bc> 90 nop
Code: c0117a22 <zap_page_range+10e/1bc> 90 nop
Code: c0117a23 <zap_page_range+10f/1bc> 90 nop
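
The <symbol+offset/size> notation in the output above can be reproduced
with a lookup over an address-sorted symbol table: pick the last symbol
at or below the address, then take the distance to the next symbol as
the size. The sketch below is only an illustration, not ksymoops' own
code. The three table entries are derived from the trace lines above
(show_registers+26b/29c, die+30/38, die_if_no_fixup); the rest of
System.map is not shown here, so nothing else is listed.

```c
#include <stdio.h>
#include <string.h>

struct symbol {
    unsigned long addr;
    const char *name;
};

/* Sorted by address, as in System.map. Start addresses are computed
 * from the offsets and sizes printed in the trace above. */
static const struct symbol symtab[] = {
    { 0xc0107e0cUL, "show_registers" },
    { 0xc01080a8UL, "die" },
    { 0xc01080e0UL, "die_if_no_fixup" },
};

/*
 * Format addr as <name+offset/size> into buf. Returns 1 on success and
 * 0 if the address is outside the ranges the table can size (below the
 * first symbol, or at or after the last one, whose size is unknown).
 */
static int resolve(unsigned long addr, char *buf, size_t len)
{
    size_t n = sizeof symtab / sizeof symtab[0];
    size_t i;

    for (i = 0; i + 1 < n; i++) {
        if (addr >= symtab[i].addr && addr < symtab[i + 1].addr) {
            snprintf(buf, len, "<%s+%lx/%lx>", symtab[i].name,
                     addr - symtab[i].addr,
                     symtab[i + 1].addr - symtab[i].addr);
            return 1;
        }
    }
    return 0;
}
```

An address such as 00100184 falls below every entry, so the lookup
fails, which corresponds to the "cannot be resolved" line for the first
EIP.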
