On Fri, Jan 04, 2002 at 12:44:57PM +0100, Ingo Molnar wrote:
>
> On Fri, 4 Jan 2002, David Lang wrote:
>
> > Ingo,
> > back in the 2.4.4-2.4.5 days when we experimented with the
> > child-runs-first scheduling patch we ran into quite a few programs that
> > died or locked up due to this. (I had a couple myself and heard of others)
>
> hm, Andrea said that the only serious issue was in the sysvinit code,
> which should be fixed in any recent distro. Andrea?
correct. I run child-run-first always on all my boxes. And
those races could trigger also before, so it's better to make them more
easily reproducible anyways.
I always run with this patch applied:
diff -urN parent-timeslice/include/linux/sched.h child-first/include/linux/sched.h
--- parent-timeslice/include/linux/sched.h Thu May 3 18:17:56 2001
+++ child-first/include/linux/sched.h Thu May 3 18:19:44 2001
@@ -301,7 +301,7 @@
* all fields in a single cacheline that are needed for
* the goodness() loop in schedule().
*/
- int counter;
+ volatile int counter;
int nice;
unsigned int policy;
struct mm_struct *mm;
diff -urN parent-timeslice/kernel/fork.c child-first/kernel/fork.c
--- parent-timeslice/kernel/fork.c Thu May 3 18:18:31 2001
+++ child-first/kernel/fork.c Thu May 3 18:20:40 2001
@@ -665,15 +665,18 @@
p->pdeath_signal = 0;
/*
- * "share" dynamic priority between parent and child, thus the
- * total amount of dynamic priorities in the system doesnt change,
- * more scheduling fairness. This is only important in the first
- * timeslice, on the long run the scheduling behaviour is unchanged.
+ * Scheduling the child first is especially useful in avoiding a
+ * lot of copy-on-write faults if the child for a fork() just wants
+ * to do a few simple things and then exec().
*/
- p->counter = (current->counter + 1) >> 1;
- current->counter >>= 1;
- if (!current->counter)
+ {
+ int counter = current->counter;
+ p->counter = (counter + 1) >> 1;
+ current->counter = counter >> 1;
+ p->policy &= ~SCHED_YIELD;
+ current->policy |= SCHED_YIELD;
current->need_resched = 1;
+ }
/* Tell the parent if it can get back its timeslice when child exits */
p->get_child_timeslice = 1;
>
> > try switching this back to the current behaviour and see if the
> > lockups still happen.
>
> there must be some other bug as well, the child-runs-first scheduling can
> cause lockups, but it shouldnt cause oopes.
definitely. My above implementation works fine.
Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Mon Jan 07 2002 - 21:00:25 EST