Re: [patch] sched: fix scheduling latencies for !PREEMPT kernels

From: Andrea Arcangeli
Date: Tue Sep 14 2004 - 10:12:49 EST


On Wed, Sep 15, 2004 at 12:28:49AM +1000, Nick Piggin wrote:
> Andrea Arcangeli wrote:
> >On Tue, Sep 14, 2004 at 11:33:48PM +1000, Nick Piggin wrote:
> >
> >>cond_rescheds everywhere? Isn't this now the worst of both worlds?
> >
> >
> >1) cond_resched should become a noop if CONFIG_PREEMPT=y
> > (cond_resched_lock of course should still unlock/relock if
> > need_resched() is set, but not __cond_resched).
>
> Unfortunately we need to keep the cond_rescheds that are called under
> the bkl. Otherwise yes, this would be nice to be able to do.

we simply need a cond_resched_bkl() for that, no? Very few places are
still serialized with the BKL, so I don't think it would be a big issue
to convert those few places to use cond_resched_bkl.

> Which is why we don't need more of them ;)

eheh ;)

> Well I don't know how good an argument the crashes one is these days,
> but generally (as far as I know) those who really care about latency
> shouldn't mind about some extra overheads.

sure, that's especially true for the hardirq and softirq total scheduler
offloading. The real question is where a generic desktop positions. I
doubt on a generic desktop a latency over 1msec matters much,
top performance of repetitive tasks that sums up like hardirqs for a NIC
sounds more sensible to me.

And for the other usages RTAI or any other hard realtime sounds safer
anyways.

> Now I don't disagree with some cond_rescheds for places where !PREEMPT
> latency would otherwise be massive, but cases like doing cond_resched
> for every page in the scanner is something that you could say is imposing
> overhead on people who *don't* want it (ie !PREEMPT).

definitely. Note that this could be simply fixed by having a
CONFIG_PREEMPT around it, but the real fix is definitely to make
cond_resched a noop with PREEMPT=y and secondly to add a
cond_resched_bkl defined as the current cond_resched as suggested above.

> :) I don't think Ingo intended this for merging as is. Maybe it is to
> test how much progress he has made.

I hope so, he said the latest patches were cleaned up and he removed the
debugging/performance cruft (the most explicit message was
20040719115952.GA13564@xxxxxxx) but it's still not clear to me if he
left the CONFIG_VOLOUNTARY_PREEMPT config option because he intends to
merge it or just temporarily. Comments?

Thanks.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/