Re: [patch] don't preempt not TASK_RUNNING tasks

From: Miklos Szeredi
Date: Fri Mar 20 2009 - 06:38:24 EST


On Fri, 20 Mar 2009, Peter Zijlstra wrote:
> On Fri, 2009-03-20 at 10:43 +0100, Miklos Szeredi wrote:
> > Ingo,
> >
> > I tested this one, and I think it makes sense in any case as an
> > optimization. It should also be good for -stable kernels.
> >
> > Does it look OK?
>
> The idea is good, but there is a risk of preemption latencies here. Some
> code paths aren't real quick between setting ->state != TASK_RUNNING and
> calling schedule.
>
> [ Both quick: as in O(1) and few instructions ]
>
> So if we're going to do this, we'd need to audit all such code paths --
> and there be lots.

Oh, yes.

In a random sample the most common pattern is something like this:

	spin_lock(&some_lock);
	/* do something */
	set_current_state(TASK_SOMESLEEP);
	/* do something more */
	spin_unlock(&some_lock);
	schedule();
	...

Which should only be positively impacted by the change. But I can
imagine rare cases where it's more complex.

> The first line of attack for this problem is making wait_task_inactive()
> suck less, which shouldn't be too hard, that unconditional 1 jiffy
> sleep is simply retarded.

I completely agree. However, I'd like to have a non-invasive solution
that can go into current and stable kernels so UML users don't need to
suffer any more.

Thanks,
Miklos

>
> > Index: linux.git/kernel/sched.c
> > ===================================================================
> > --- linux.git.orig/kernel/sched.c 2009-03-20 09:40:47.000000000 +0100
> > +++ linux.git/kernel/sched.c 2009-03-20 10:28:56.000000000 +0100
> > @@ -4632,6 +4632,10 @@ asmlinkage void __sched preempt_schedule
> >  	if (likely(ti->preempt_count || irqs_disabled()))
> >  		return;
> >
> > +	/* No point in preempting we are just about to go to sleep. */
> > +	if (current->state != TASK_RUNNING)
> > +		return;
> > +
> >  	do {
> >  		add_preempt_count(PREEMPT_ACTIVE);
> >  		schedule();
>