Re: Linux scheduler, overscheduling performance, threads

From: James Manning (jmm@raleigh.ibm.com)
Date: Sat Jan 22 2000 - 21:06:31 EST


[Warning, this is my impression from reading this and other threads]

[ Saturday, January 22, 2000 ] Brian Hurt wrote:
> This discussion is rapidly degenerating into a "does too!" "does not!"
> match. So let me turn the question around. Assume, for a moment, that a
> patch for Linux existed which did two things:
> 1) Ran significantly faster for large run queues -- for example, it switched
> from the current O(n) algorithm to an O(log n) algorithm.
> 2) Ran slower than the current scheduling algorithm for short run
> queues.
> And the patch had no other effects on the kernel at large and was
> otherwise well written, etc.
>
> Am I right in assuming that whether the patch would be accepted would
> depend upon how much slower it made the common case?

With the "common" case, even on many heavily-used servers, being small
run queues, I'm under the impression (from Ingo's previous mail,
Ingo please correct me if I'm wrong) that *any* hit on small (1-3)
run queues will disqualify a patch (that config-time scheduler selection
has some possibilities, though).
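
(For concreteness, here's a rough sketch of the trade-off -- hypothetical
types and names, not the actual sched.c code. A heap makes picking the
next task cheap, but taxes every enqueue with O(log n) work plus pointer
chasing, which is exactly the hit you'd feel at depths of 1-3.)

    #include <stddef.h>

    struct task {
            int goodness;           /* dynamic priority, recomputed each pass */
            struct task *next;      /* linkage in the run queue list */
    };

    /* Current approach: O(n) scan of the whole run queue -- cheap and
     * cache-hot for the 1-3 entry common case. */
    struct task *pick_linear(struct task *rq_head)
    {
            struct task *t, *best = NULL;
            int best_goodness = -1000;

            for (t = rq_head; t; t = t->next) {
                    if (t->goodness > best_goodness) {
                            best_goodness = t->goodness;
                            best = t;
                    }
            }
            return best;
    }

    /* Hypothetical O(log n) alternative: a max-heap keyed on goodness.
     * The best task is always heap[0], so picking is O(1), but every
     * enqueue pays a sift-up -- the constant-factor tax on short queues. */
    void heap_enqueue(struct task **heap, int *nr, struct task *t)
    {
            int i = (*nr)++;

            heap[i] = t;
            while (i > 0 && heap[(i - 1) / 2]->goodness < heap[i]->goodness) {
                    struct task *tmp = heap[i];     /* swap with parent */
                    heap[i] = heap[(i - 1) / 2];
                    heap[(i - 1) / 2] = tmp;
                    i = (i - 1) / 2;
            }
    }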

> Obviously, if it made the common case no slower, no one would mind
> putting it in the kernel.

I don't think this is necessarily true. Simplicity, readability, and
maintainability are all factors, and flexibility may become increasingly
important as things like cluster scheduling start creeping in. Sure,
it'd have an increased likelihood of being accepted, but anything that
becomes a debugging or maintenance burden may well be tossed.

> On the other hand, if it made the common case a million times
> slower, there is no way it'd ever get into the kernel, and rightfully so.
> Where (roughly) in between is the breakpoint?

The common case is going to be a run queue depth under 10 for quite
some time. Even something that maintains low-run-queue performance now
may not get accepted simply because of how it's coded, as it may hurt
performance in untested situations or in future scenarios (for example,
if its cache behavior were slightly worse, it might be no slower on
current machines, yet become slower as the gap between L2 and memory
latency keeps widening).

A config option (tagged EXPERIMENTAL) seems like the best compromise,
as the patch could be dispersed quickly and feedback gathered on its
improvement. Let it hurt the typical case as much as it needs to; only
people with deep run queues will enable it, since we'll document its
worse performance at low run queue depths in the Configure.help entry.
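
Something along these lines, maybe (CONFIG_SCHED_BIGQ is a name I just
made up; the config.in hunk and Configure.help entry are only sketches
of the usual form):

    In config.in, behind CONFIG_EXPERIMENTAL:

       if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then
          bool 'O(log n) run queue scheduler (EXPERIMENTAL)' CONFIG_SCHED_BIGQ
       fi

    And in Documentation/Configure.help:

       O(log n) run queue scheduler
       CONFIG_SCHED_BIGQ
         This replaces the O(n) run queue scan with an O(log n)
         structure.  It only wins with very deep run queues; at typical
         depths (1-3) it is SLOWER than the stock scheduler.  If unsure,
         say N.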

This way you can focus this new patch on being what it should be,
totally optimized for insane numbers of run queue entries, without any
baggage for the typical case (that sounds weird :). Being a config
option (like the popular RAID 0.90 patch) should make it easier for
people to test the before-and-after: they can patch the kernel once and
quickly build the w/ and w/o cases, with no need to reverse the patch
out to turn it back off.
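
In the code itself the switch could stay trivial (hypothetical names
again, building on the earlier sketch):

    /* Build-time switch: both variants live in the patched tree and
     * the .config decides which one gets compiled in. */
    static struct task *pick_next(struct task *rq_head,
                                  struct task **heap, int nr_running)
    {
    #ifdef CONFIG_SCHED_BIGQ
            return nr_running ? heap[0] : NULL;   /* O(1) pick off the heap */
    #else
            return pick_linear(rq_head);          /* stock O(n) scan */
    #endif
    }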

Who knows, later on this and other such config options could get
grouped into CONFIG_BENCHMARK and this patch will take its place as
CONFIG_BENCHMARK_ULTRASCHEDULER9000 :)

James

-- 
Miscellaneous Engineer --- IBM Netfinity Performance Development
