2.1.108 crash (recursion loop)

Michael O'Reilly (michael@metal.iinet.net.au)
13 Jul 1998 09:43:11 +0800


Managed to crash a 2.1.108 kernel (PPro 200MHz, 256MB of RAM, running
squid, no patches for large FD sets).

The final crash is a page fault, but the interesting bit is further up
the stack. The stack looked like:

109f88 die_if_kernel
1d2d8b stext_lock
1d3b00 stext_lock
10ec27 do_page_fault
1d3b00 stext_lock
109be4 error_code
116be0 exit_notify
116f67 do_exit
109f88 ....

do_exit -> exit_notify -> page fault -> die_if_kernel -> do_exit .....

which is ... ummm. interesting. :)
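
As a rough illustration of why that cycle never unwinds, here is a small
userspace model of the call chain. It is only a sketch: the model_*
functions, the tasklist_bad flag and the MAX_DEPTH cut-off are made up for
the example and are not kernel code. The assumption is simply that the fault
in exit_notify is deterministic, so every re-entry into do_exit hits it
again.

```c
#include <stdio.h>

#define MAX_DEPTH 4          /* hypothetical cut-off so the model terminates */

static int depth;            /* number of times the model entered do_exit */
static int tasklist_bad = 1; /* assumption: the task list walk always faults */

static void model_do_exit(void);

/* Stand-in for die_if_kernel: it disposes of the current task by exiting it. */
static void model_die_if_kernel(void)
{
    model_do_exit();
}

/* Stand-in for the page fault path: a kernel-mode fault ends in die_if_kernel. */
static void model_page_fault(void)
{
    model_die_if_kernel();
}

/* Stand-in for exit_notify: walking the bad task list faults every time. */
static void model_exit_notify(void)
{
    if (tasklist_bad)
        model_page_fault();
}

/* Stand-in for do_exit: nothing here notices that an exit is already running. */
static void model_do_exit(void)
{
    if (++depth > MAX_DEPTH) {
        printf("giving up after %d nested exits\n", depth);
        return;
    }
    model_exit_notify();
}

/* Runs the model once and returns how many times do_exit was entered. */
int run_model(void)
{
    depth = 0;
    model_do_exit();
    return depth;
}
```

Without the artificial cut-off the model would recurse until the stack ran
out, which matches the repeating frames in the trace above.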

The page fault in exit_notify is at:

(gdb) list *0xc0116be0
0xc0116be0 is in exit_notify (exit.c:149).
144 {
145 struct task_struct * p;
146
147 read_lock(&tasklist_lock);
148 for_each_task(p) {
149 if (p->p_opptr == father) {
150 p->exit_signal = SIGCHLD;
151 p->p_opptr = task[smp_num_cpus] ? : task[0]; /* init */
152 if (p->pdeath_signal) send_sig(p->pdeath_signal, p, 0);
153 }

There was an earlier fault (which actually got logged):

Jul 12 17:26:42 house kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000004
Jul 12 17:26:42 house kernel: current->tss.cr3 = 07b72000, %cr3 = 07b72000
Jul 12 17:26:42 house kernel: *pde = 00000000
Jul 12 17:26:42 house kernel: Oops: 0000
Jul 12 17:26:42 house kernel: CPU: 0
Jul 12 17:26:42 house kernel: EIP: 0010:[<c012d93e>]
Jul 12 17:26:42 house kernel: EFLAGS: 00010017
Jul 12 17:26:42 house kernel: eax: 00000000 ebx: c67fd7f4 ecx: cae7b5a4 edx: 00000000
Jul 12 17:26:42 house kernel: esi: 00000287 edi: c67fd7f0 ebp: c785bf6c esp: c785bf38
Jul 12 17:26:42 house kernel: ds: 0018 es: 0018 ss: 0018
Jul 12 17:26:42 house kernel: Process squid (pid: 14151, process nr: 18, stackpage=c785b000)
Jul 12 17:26:42 house kernel: Stack: c680a03c 00000002 000001e2 00000001 c012dc02 c785bf6c c680a280 00000010
Jul 12 17:26:42 house kernel: c680a2c0 c680a000 c50f5800 c785a000 00000000 0000007f c67fd000 c012df61
Jul 12 17:26:42 house kernel: 000001e2 c680a000 01b98875 c785a000 00000000 bffffcdc bffffde4 00000000
Jul 12 17:26:42 house kernel: Call Trace: [<c012dc02>] [<c012df61>] [<c0109aac>]
Jul 12 17:26:42 house kernel: Code: 8b 42 04 39 d8 75 f7 89 4a 04 56 9d 8b 1f 0f b7 43 1c 48 75

(gdb) list *0xc012d93e
0xc012d93e is in free_wait (/usr/src/linux/include/linux/sched.h:659).
654 struct wait_queue * head = next;
655 struct wait_queue * tmp;
656
657 while ((tmp = head->next) != wait) {
658 head = tmp;
659 }
660 head->next = next;
661 }
662
663 extern inline void remove_wait_queue(struct wait_queue ** p, struct wait_queue * wait)

which is all too common in 2.1.108. :(
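
The first bytes of the Code: line (8b 42 04) decode to mov 0x4(%edx),%eax,
and edx is 0 in the register dump, so the oops is consistent with head being
NULL when line 657 reads head->next. The following is a userspace sketch of
the same ring walk with a NULL check added. struct wq and wq_remove_checked
are hypothetical names for the example, the two-field layout is an assumption
about struct wait_queue, and this is not a proposed kernel patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct wait_queue: a task pointer plus a link. */
struct wq {
    void *task;
    struct wq *next;
};

/*
 * Unlink 'wait' from a circular singly linked ring, using the same walk as
 * the listing above. Returns 0 on success, or -1 if a NULL link is met
 * instead of faulting on it.
 */
int wq_remove_checked(struct wq *wait)
{
    struct wq *next = wait->next;
    struct wq *head = next;
    struct wq *tmp;

    if (head == NULL)
        return -1;              /* the entry itself has no successor */

    /* Walk the ring until we find the entry whose next is 'wait'. */
    while ((tmp = head->next) != wait) {
        if (tmp == NULL)
            return -1;          /* ring is broken: this is where the oops hit */
        head = tmp;
    }
    head->next = next;          /* predecessor now skips over 'wait' */
    return 0;
}
```

The check only catches a NULL link. A ring that has been corrupted so that
it no longer contains 'wait' at all would still loop forever in this sketch.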

(gdb) list *0xc012dc02
0xc012dc02 is in do_select (select.c:202).
197 current->state = TASK_RUNNING;
198
199 out:
200 if (timeout) {
201 free_wait(&wait_table);
202 free_page((unsigned long) wait_table.entry);
203 }
204 out_nowait:
205 unlock_kernel();
206 return retval;
(gdb) list *0xc012df61
0xc012df61 is in sys_select (select.c:256).
251 goto out;
252 zero_fd_set(n, fds->res_in);
253 zero_fd_set(n, fds->res_out);
254 zero_fd_set(n, fds->res_ex);
255
256 ret = do_select(n, fds, timeout);
257
258 if (tvp && !(current->personality & STICKY_TIMEOUTS)) {
259 unsigned long timeout = current->timeout - jiffies - 1;
260 time_t sec = 0, usec = 0;
