[PATCH] new syscall: sys_vfork

Perry Harrington (pedward@sun4.apsoft.com)
Fri, 8 Jan 1999 10:49:54 -0800 (PST)


Hello,

I hacked in support for a traditional-style vfork. I haven't yet
tried running an application against the new vfork; I wanted to
release what I have and get feedback, since this is the first real
patch I've done.

Anyhow, some background first:

This implementation of vfork supports these features:

- the VM is shared with the parent (CLONE_VM), not copied
- the parent sleeps while the vfork()ed child is running
- the parent wakes when the child calls exec() or exit()
- the implementation theoretically allows for recursive vforks
- it's executable from within a cloned thread
- if I'm right about the flags, the sigmask is not cloned
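
For reference, the calling pattern these features are meant to serve can
be sketched from user space. This is a minimal sketch that uses the C
library's existing vfork(), not the sys_vfork entry point from this patch;
the helper name vfork_exit_status is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: vfork a child that exits at once with `code`,
 * and return the exit status the parent observes (-1 on failure). */
int vfork_exit_status(int code)
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);    /* child: may only exec or _exit */

    /* The parent resumes here only after the child has exited. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling vfork_exit_status(7) from a test program should hand back the
child's exit code, since the child does nothing but _exit().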

A little bit about the 'controversial' parts: the implementation
uses a wait queue in the task structure. After a successful vfork,
the parent sleeps on its vfork wait queue. When the child execs or
exits, it calls wake_up(&current->p_pptr->vfork_sleep), which wakes
the parent. The wakeup in exec is right at the top of do_execve().
The wakeup in exit comes just before the parent is notified of the
child's exit (before notify_parent()).
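
The sleep/wakeup handshake can be illustrated with a user-space analogy.
This is only a sketch: a pipe stands in for the vfork_sleep wait queue, a
blocking read() for sleep_on(), and a write() for wake_up(). The function
name is hypothetical. Note that a pipe remembers a wakeup that arrives
before the reader blocks, which a bare sleep_on()/wake_up() pair does not;
that difference is a property of this sketch, not of the patch.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent blocked until the child's wakeup, -1 on error. */
int sleep_until_child_wakes(void)
{
    int fds[2];
    char token = 0;
    pid_t pid;
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: stands in for wake_up(&current->p_pptr->vfork_sleep) */
        close(fds[0]);
        if (write(fds[1], "w", 1) != 1)
            _exit(1);
        _exit(0);
    }

    /* parent: stands in for sleep_on(&current->vfork_sleep) */
    close(fds[1]);
    n = read(fds[0], &token, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (n == 1 && token == 'w') ? 1 : -1;
}
```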

Recursion works because a vforked child that itself vforks simply
sleeps in turn; as each vforked child performs an exec or exit, the
wakeups percolate up through the stack of vforked processes.
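
The unwinding order can be checked with a toy model. Everything here is
an assumption made for illustration: struct toy_task and the toy_*
helpers are hypothetical stand-ins, not the kernel's task_struct or
wait-queue API. Each exec or exit wakes only the immediate parent, so a
chain of vforks unwinds one level at a time.

```c
#include <stddef.h>

/* Toy model only: hypothetical stand-in for a task with a vfork wait queue. */
struct toy_task {
    struct toy_task *parent;  /* plays the role of p_pptr */
    int sleeping;             /* 1 while blocked on its vfork wait queue */
};

/* Parent vforks: the child is linked to it and the parent goes to sleep. */
void toy_vfork(struct toy_task *parent, struct toy_task *child)
{
    child->parent = parent;
    child->sleeping = 0;
    parent->sleeping = 1;
}

/* Child execs or exits: wake whoever is sleeping directly above it. */
void toy_exec_or_exit(struct toy_task *t)
{
    if (t->parent != NULL)
        t->parent->sleeping = 0;
}

/* Build root -> child -> grandchild, then unwind one level at a time.
 * Returns 1 if each wakeup reaches only the immediate parent. */
int toy_recursive_vfork_unwinds(void)
{
    struct toy_task root = { NULL, 0 };
    struct toy_task child = { NULL, 0 };
    struct toy_task grandchild = { NULL, 0 };

    toy_vfork(&root, &child);
    toy_vfork(&child, &grandchild);
    if (!(root.sleeping && child.sleeping && !grandchild.sleeping))
        return 0;

    toy_exec_or_exit(&grandchild);      /* wakes child, root still asleep */
    if (!(root.sleeping && !child.sleeping))
        return 0;

    toy_exec_or_exit(&child);           /* now root wakes as well */
    return !root.sleeping;
}
```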

Please let me know if I've done anything grossly wrong, or just wrong.
Additionally, could someone tell me how to do direct syscalls? I'm fuzzy
on that ;)
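
One way to issue a system call by number, sketched here under
assumptions: it relies on the C library's generic syscall() function and
the SYS_* constants from <sys/syscall.h>, neither of which is part of
this patch. The example calls getpid rather than the new vfork, because
a vfork child must not return from the function that called vfork, which
makes it a poor fit for a generic wrapper. The name direct_getpid is
hypothetical.

```c
#define _DEFAULT_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Invoke getpid by system call number instead of the usual wrapper. */
long direct_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The result should match what the ordinary getpid() wrapper returns for
the same process.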

--Perry

------------------------------8<-----------------------------------------------

diff -u --recursive linux.vanilla/arch/i386/kernel/entry.S linux/arch/i386/kernel/entry.S
--- linux.vanilla/arch/i386/kernel/entry.S Thu Jan 7 19:21:54 1999
+++ linux/arch/i386/kernel/entry.S Thu Jan 7 20:38:18 1999
@@ -559,13 +559,14 @@
.long SYMBOL_NAME(sys_sendfile)
.long SYMBOL_NAME(sys_ni_syscall) /* streams1 */
.long SYMBOL_NAME(sys_ni_syscall) /* streams2 */
+ .long SYMBOL_NAME(sys_vfork) /* 190 */

/*
- * NOTE!! This doesn' thave to be exact - we just have
+ * NOTE!! This doesn't have to be exact - we just have
* to make sure we have _enough_ of the "sys_ni_syscall"
* entries. Don't panic if you notice that this hasn't
* been shrunk every time we add a new system call.
*/
- .rept NR_syscalls-189
+ .rept NR_syscalls-190
.long SYMBOL_NAME(sys_ni_syscall)
.endr
diff -u --recursive linux.vanilla/arch/i386/kernel/process.c linux/arch/i386/kernel/process.c
--- linux.vanilla/arch/i386/kernel/process.c Thu Jan 7 19:21:54 1999
+++ linux/arch/i386/kernel/process.c Thu Jan 7 20:33:23 1999
@@ -781,6 +781,19 @@
return do_fork(clone_flags, newsp, &regs);
}

+asmlinkage int sys_vfork(struct pt_regs regs)
+{
+ int child;
+
+ child = do_fork(CLONE_VM | SIGCHLD, regs.esp, &regs);
+
+ if (child > 0) {
+ sleep_on(&current->vfork_sleep);
+ }
+
+ return child;
+}
+
/*
* sys_execve() executes a new program.
*/
diff -u --recursive linux.vanilla/fs/exec.c linux/fs/exec.c
--- linux.vanilla/fs/exec.c Sun Nov 15 09:52:27 1998
+++ linux/fs/exec.c Fri Jan 8 10:32:59 1999
@@ -808,6 +808,9 @@
int retval;
int i;

+ /* vfork semantics say wakeup on exec or exit */
+ wake_up(&current->p_pptr->vfork_sleep);
+
bprm.p = PAGE_SIZE*MAX_ARG_PAGES-sizeof(void *);
for (i=0 ; i<MAX_ARG_PAGES ; i++) /* clear page-table */
bprm.page[i] = 0;
diff -u --recursive linux.vanilla/include/linux/sched.h linux/include/linux/sched.h
--- linux.vanilla/include/linux/sched.h Thu Jan 7 19:27:44 1999
+++ linux/include/linux/sched.h Thu Jan 7 21:57:20 1999
@@ -258,6 +258,10 @@
struct task_struct **tarray_ptr;

struct wait_queue *wait_chldexit; /* for wait4() */
+
+/* sleep in vfork parent */
+ struct wait_queue *vfork_sleep;
+
unsigned long policy, rt_priority;
unsigned long it_real_value, it_prof_value, it_virt_value;
unsigned long it_real_incr, it_prof_incr, it_virt_incr;
@@ -298,6 +302,7 @@
struct files_struct *files;
/* memory management info */
struct mm_struct *mm;
+
/* signal handlers */
spinlock_t sigmask_lock; /* Protects signal and blocked */
struct signal_struct *sig;
@@ -349,6 +354,7 @@
/* pidhash */ NULL, NULL, \
/* tarray */ &task[0], \
/* chld wait */ NULL, \
+/* vfork sleep */ NULL, \
/* timeout */ SCHED_OTHER,0,0,0,0,0,0,0, \
/* timer */ { NULL, NULL, 0, 0, it_real_fn }, \
/* utime */ {0,0,0,0},0, \
diff -u --recursive linux.vanilla/kernel/exit.c linux/kernel/exit.c
--- linux.vanilla/kernel/exit.c Tue Nov 24 09:57:10 1998
+++ linux/kernel/exit.c Fri Jan 8 10:34:10 1999
@@ -292,6 +292,10 @@
kill_pg(current->pgrp,SIGHUP,1);
kill_pg(current->pgrp,SIGCONT,1);
}
+
+ /* notify parent sleeping on vfork() */
+ wake_up(&current->p_pptr->vfork_sleep);
+
/* Let father know we died */
notify_parent(current, current->exit_signal);

diff -u --recursive linux.vanilla/kernel/fork.c linux/kernel/fork.c
--- linux.vanilla/kernel/fork.c Thu Jan 7 19:27:29 1999
+++ linux/kernel/fork.c Thu Jan 7 20:24:53 1999
@@ -521,6 +521,7 @@
p->p_pptr = p->p_opptr = current;
p->p_cptr = NULL;
init_waitqueue(&p->wait_chldexit);
+ init_waitqueue(&p->vfork_sleep);

p->sigpending = 0;
sigemptyset(&p->signal);

------------------------------8<-----------------------------------------------

-- 
Perry Harrington       Linux rules all OSes.    APSoft      ()
email: perry@apsoft.com 			Think Blue. /\
