Re: More on 2.1.129 oops

Richard Gooch (rgooch@atnf.csiro.au)
Mon, 23 Nov 1998 10:17:38 +1100


Hi, Philipp. Your analysis looks spot on.

> > > - You get two oopses
> >
> > Yes.
> > The first EIP is at c01ccb2b and the second is at c01ccb4c. Below is
> > one page worth of disassembly around those locations.
> >
> > c01ccb29: cd 80 int $0x80
> > c01ccb2b: 39 e6 cmpl %esp,%esi
>
> > c01ccb4a: cd 80 int $0x80
> > c01ccb4c: 39 e6 cmpl %esp,%esi
>
> Very good. Got the problem. I do not know about a real fix, but a good
> workaround would be to remove the "inline" attribute of kernel_thread
> in include/asm-i386/unistd.h . Would be nice if you told me if that
> fixes the problem (it really should).

Well, I was about to do that when I noticed the number of places that
kernel_thread() is called. So instead I made an ugly patch to
init/main.c to define a static function kernel_thread2() and used that
where appropriate (the swapper tasks and initrd, but not the init
thread). Patch appended. Needless to say, it works perfectly. I'll
leave the experimental machine up all day and see how it performs.

Obviously this patch can't go into Linus' kernel as it stands, since it
is x86-specific, but it demonstrates the point.
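
For anyone who wants to play with the calling contract outside the
kernel, here is a rough user-space analogue. It is only a sketch under
stated assumptions, not the code in the patch: it assumes Linux with the
glibc clone() wrapper, and the names user_thread_sketch(), set_flag() and
SKETCH_STACK_SIZE are made up for illustration. Like kernel_thread2(), it
is an ordinary (non-inline) static function that always adds CLONE_VM,
returns the child's pid to the parent, and has the child run fn(arg) and
then exit. Unlike the kernel version, the child is given its own freshly
allocated stack.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical stack size; ample for a child that does almost nothing. */
#define SKETCH_STACK_SIZE (64 * 1024)

/*
 * Hypothetical user-space analogue of kernel_thread2().
 * Starts fn(arg) in a child that shares the caller's address space and
 * returns the child's pid to the parent (or -1 on failure).
 * The caller must free(*stack_out) after reaping the child.
 */
static pid_t user_thread_sketch(int (*fn)(void *), void *arg,
                                unsigned long flags, void **stack_out)
{
    char *stack;
    pid_t pid;

    assert(fn != NULL && stack_out != NULL);

    stack = malloc(SKETCH_STACK_SIZE);
    if (stack == NULL)
        return -1;
    *stack_out = stack;

    /* The stack grows downwards on x86, so pass the top of the buffer. */
    pid = clone(fn, stack + SKETCH_STACK_SIZE, (int)(flags | CLONE_VM), arg);
    if (pid < 0) {
        free(stack);
        *stack_out = NULL;
    }
    return pid;
}

/*
 * Example child: writes through arg. The parent sees the write only
 * because CLONE_VM makes both tasks share one address space.
 */
static int set_flag(void *arg)
{
    *(volatile int *)arg = 42;
    return 0;
}
```

A caller would pass SIGCHLD in flags (as the do_linuxrc call does), wait
for the returned pid with waitpid(), and then free the stack. The point
of the sketch is only the contract: the parent gets a pid back, and the
child never returns into the parent's code.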

Linus: do you prefer a solution like the one I've appended, or an
optional semaphore? Personally, I like the semaphore idea, as it's
more flexible.

Regards,

Richard....

--- main.c~ Sun Nov 22 19:26:14 1998
+++ main.c Mon Nov 23 10:07:22 1998
@@ -1200,6 +1200,28 @@

struct task_struct *child_reaper = &init_task;

+static pid_t kernel_thread2(int (*fn)(void *), void * arg, unsigned long flags)
+{
+ long retval;
+
+ __asm__ __volatile__(
+ "movl %%esp,%%esi\n\t"
+ "int $0x80\n\t" /* Linux/i386 system call */
+ "cmpl %%esp,%%esi\n\t" /* child or parent? */
+ "je 1f\n\t" /* parent - jump */
+ "pushl %3\n\t" /* push argument */
+ "call *%4\n\t" /* call fn */
+ "movl %2,%0\n\t" /* exit */
+ "int $0x80\n"
+ "1:\t"
+ :"=a" (retval)
+ :"0" (__NR_clone), "i" (__NR_exit),
+ "r" (arg), "r" (fn),
+ "b" (flags | CLONE_VM)
+ :"si");
+ return retval;
+}
+
/*
* Ok, the machine is now initialized. None of the devices
* have been touched yet, but the CPU subsystem is up and
@@ -1266,16 +1288,16 @@
sock_init();

/* Launch bdflush from here, instead of the old syscall way. */
- kernel_thread(bdflush, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGHAND);
+ kernel_thread2(bdflush, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGHAND);
/* Start the background pageout daemon. */
kswapd_setup();
- kernel_thread(kswapd, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGHAND);
+ kernel_thread2(kswapd, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGHAND);

#if CONFIG_AP1000
/* Start the async paging daemon. */
{
extern int asyncd(void *);
- kernel_thread(asyncd, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGHAND);
+ kernel_thread2(asyncd, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGHAND);
}
#endif

@@ -1321,7 +1343,7 @@
int error;
int i, pid;

- pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
+ pid = kernel_thread2(do_linuxrc, "/linuxrc", SIGCHLD);
if (pid>0)
while (pid != wait(&i));
if (MAJOR(real_root_dev) != RAMDISK_MAJOR
