Re: [announce] [patch] Voluntary Kernel Preemption Patch

From: Ingo Molnar
Date: Sat Jul 10 2004 - 07:48:13 EST



* Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:

> > unlike the lowlatency patches, this patch doesn't add a lot of new
> > scheduling points to the source code, it rather reuses a rich but
> > currently inactive set of scheduling points that already exist in the
> > 2.6 tree: the might_sleep() debugging checks. Any code point that does
> > might_sleep() is in fact ready to sleep at that point. So the patch
> > activates these debugging checks to be scheduling points. This reduces
> > complexity and impact quite significantly.
>
> I don't think this is a good idea. Just because a function might
> sleep it doesn't mean it should sleep. I'd rather add the
> might_sleep() to cond_resched() and replace the former with the latter
> in the cases where it makes sense.

think of voluntary preemption as a variant of CONFIG_PREEMPT with
different tradeoffs: it doesn't preempt as much code but it's cheaper (in
terms of code footprint and overhead) and less risky (in terms of code
affected).

What you say is equivalent to: 'just because a process has higher
priority doesn't mean it should be scheduled', which is the wrong approach
because it is ultimately the decision of the user which tasks get
scheduled (by giving processes various priorities) and the decision of
the scheduler (for freely schedulable tasks). The preemption decision
does not depend and should not depend on the kernel function utilized!
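
A minimal userspace sketch of the mechanism under discussion may help
here. It is a model only, not the patch itself: every name prefixed
with model_ is hypothetical, and voluntary_preempt_enabled merely stands
in for CONFIG_PREEMPT_VOLUNTARY. It assumes that an activated
might_sleep() behaves like a conditional reschedule, i.e. it yields only
when the scheduler has already asked for a reschedule, so the call site
itself never makes the preemption decision.

```c
#include <stdbool.h>

/* Hypothetical userspace model; these are NOT the kernel's definitions. */
static bool voluntary_preempt_enabled = true; /* stands in for CONFIG_PREEMPT_VOLUNTARY */
static bool need_resched_flag = false;        /* set by the (modelled) scheduler */
static int reschedule_count = 0;              /* how often we actually yielded */

/* Model of schedule(): count the switch and clear the pending request. */
static void model_schedule(void)
{
    reschedule_count++;
    need_resched_flag = false;
}

/* Model of cond_resched(): yield only if a reschedule is pending.
 * Returns 1 if it yielded, 0 otherwise. */
static int model_cond_resched(void)
{
    if (!need_resched_flag)
        return 0;
    model_schedule();
    return 1;
}

/* Model of might_sleep(): a debugging annotation that, in this sketch,
 * doubles as a scheduling point when voluntary preemption is enabled.
 * The real debugging check (sleeping in atomic context) is omitted. */
static void model_might_sleep(void)
{
    if (voluntary_preempt_enabled)
        model_cond_resched();
}
```

In this model a call to model_might_sleep() is a no-op unless
need_resched_flag was set beforehand, which is the point made above: the
annotation only provides a safe place to act on a decision taken
elsewhere.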

if you don't care about latencies and want to maximize throughput (e.g.
for servers) then you don't want to enable CONFIG_PREEMPT_VOLUNTARY.
With it disabled you get artificial batching of parallel workloads.

FYI, i am also preparing a preemption patch where there's a (per-task)
tunable for 'expected maximum latency' and the kernel would measure
latencies and not do a forced preemption unless this latency is being
exceeded.

Voluntary preemption and CONFIG_PREEMPT mean this tunable has a value
of 0 - we reschedule as soon as possible. Server workloads mean a much
higher tolerated latency value, in the range of 50 msecs or so. Both are
fair expectations and settings.
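
As a rough illustration of the tunable described above, the following
sketch models the decision in userspace. It is an assumption-laden
example, not code from the patch being prepared: struct task_model, its
field names and should_force_preempt() are all hypothetical, and how the
kernel would actually measure the latency is left open.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task state; field names are illustrative only. */
struct task_model {
    uint64_t max_latency_ns;           /* tunable: 'expected maximum latency' */
    uint64_t resched_pending_since_ns; /* when a reschedule was first requested */
    bool resched_pending;              /* is a reschedule currently requested? */
};

/* Decide whether to force a preemption at time now_ns.
 * No preemption is forced while the measured latency stays below the
 * tunable. ">=" is used so that a tunable of 0 means "reschedule as
 * soon as possible". */
static bool should_force_preempt(const struct task_model *t, uint64_t now_ns)
{
    if (!t->resched_pending)
        return false;
    return (now_ns - t->resched_pending_since_ns) >= t->max_latency_ns;
}
```

With max_latency_ns set to 0 the function returns true as soon as a
reschedule is pending; with a value of 50000000 (50 msecs) it returns
false until that much time has passed since the request.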

Ingo