Re: [PATCH] scheduler patch, faster still

Richard Gooch (rgooch@atnf.csiro.au)
Mon, 28 Sep 1998 11:04:19 +1000


yodaiken@chelm.cs.nmt.edu writes:
> On Mon, Sep 28, 1998 at 09:52:52AM +1000, Richard Gooch wrote:
> > Rik van Riel writes:
> > > Hi Richard, Larry,
> > >
> > > Here is a second version of my scheduler patch. It
> > > should improve RT latency a little bit and remove
>
> Here is the difficulty with this discussion:
> No documented problem
> No reason to believe that the "solution" solves the unspecified "problem"
> No clear understanding of the effects of this change on the system
>
> Why, this could be a discussion among AIX developers!
>
> Richard points out a strange result shown by his benchmark. He
> _guesses_ that this result implies that (A) there is a performance
> problem for real applications and (B) this problem may be solved by
> a second queue. But what is the evidence for either? Note that a
> queue of length 10 is trivial, a queue of length 100 is most likely
> trivial on a fast processor. How about a benchmark with an
> instrumented kernel showing time spent searching the queue before we
> optimize time searching the queue? That is, you should be able to
> show that for Application X, wall clock time of T involves T_s time
> searching the queue where T_s is a substantial fraction of T.

A: I've explained previously that we have latency-sensitive RT tasks
sitting in a tight read-block/do-something-simple/write/read-again
loop.
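
To make that concrete, the application-side loop looks roughly like
this (a minimal sketch, not our actual code: the device name
"/dev/rtdata" and the 64-byte message size are made-up placeholders):

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>
    #include <fcntl.h>

    #define MSG_SIZE 64

    int main(void)
    {
        char buf[MSG_SIZE];
        ssize_t n;
        int i, fd;

        fd = open("/dev/rtdata", O_RDWR);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        for (;;) {
            n = read(fd, buf, sizeof buf);   /* block until data arrives */
            if (n <= 0)
                break;
            for (i = 0; i < n; i++)          /* do something simple */
                buf[i] ^= 1;
            if (write(fd, buf, n) != n)      /* write the result back */
                break;
        }
        close(fd);
        return 0;
    }

Every microsecond between the data arriving and read() returning in
that loop is latency we eat, which is why the time spent inside
schedule() matters to us.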

B: The evidence for a second run queue solving the problem is very
strong. The numbers I've posted show beyond any doubt that an
increased run queue length has a measurable effect on context switch
latency. I've posted figures showing the per-process cost of extra
entries on the run queue.
My published results show that a queue length of 10 is in fact not
trivial, and a queue length of 100 is not trivial even on a fast
processor. Just 3 extra entries are already significant on a
Pentium 100: the thread switch time doubles. With 10 extra processes
(possible with one of our applications) the latency is horrendous.
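
For anyone who wants to reproduce the shape of these numbers, a
lat_ctx-style harness along these lines will do it (a sketch, not the
benchmark I actually used: two processes ping-pong a token through a
pair of pipes while N spinners pad the run queue; it assumes a
uniprocessor box, where each round trip costs two context switches):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <unistd.h>
    #include <signal.h>
    #include <sys/time.h>
    #include <sys/wait.h>

    #define ROUNDS   10000
    #define MAX_BUSY 64

    int main(int argc, char **argv)
    {
        int nbusy = argc > 1 ? atoi(argv[1]) : 0;
        pid_t busy[MAX_BUSY], child;
        int p1[2], p2[2], i;
        char tok = 'x';
        struct timeval t0, t1;
        double us;

        if (nbusy > MAX_BUSY)
            nbusy = MAX_BUSY;

        /* Pad the run queue with pure CPU burners. */
        for (i = 0; i < nbusy; i++)
            if ((busy[i] = fork()) == 0)
                for (;;)
                    ;

        if (pipe(p1) < 0 || pipe(p2) < 0) {
            perror("pipe");
            return 1;
        }

        if ((child = fork()) == 0) {
            /* Echo the token back until the parent goes away. */
            for (;;) {
                if (read(p1[0], &tok, 1) != 1)
                    _exit(0);
                write(p2[1], &tok, 1);
            }
        }

        gettimeofday(&t0, NULL);
        for (i = 0; i < ROUNDS; i++) {
            write(p1[1], &tok, 1);
            read(p2[0], &tok, 1);
        }
        gettimeofday(&t1, NULL);

        us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
        printf("%d extra runnable: %.2f us per switch\n",
               nbusy, us / (ROUNDS * 2.0));

        kill(child, SIGKILL);
        for (i = 0; i < nbusy; i++)
            kill(busy[i], SIGKILL);
        while (wait(NULL) > 0)
            ;
        return 0;
    }

Run it with increasing N and watch the per-switch cost climb with run
queue length.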

The third issue you raised above (that T_s be a significant fraction
of T) is covered by my response to (A).

> As it is, we have no solid evidence that there is anything wrong
> with the scheduler. In fact, we have DaveM's experience (not to
> mention Cort's experience with PPC Linux) to show that hammering out
> a few microseconds on schedule time seems to have NO effect on
> system performance.

I'm not talking about overall system performance. I'm talking about
the effect of increased latency on latency-sensitive RT
applications. This is a different world.

Also, a separate RT run queue would make the scheduler code simpler
and more robust.
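
Sketched as stand-alone user-space C rather than real kernel code,
with every name here (task, rt_queue, run_queue, goodness) purely
illustrative, the dispatch path would look something like:

    #include <stdio.h>
    #include <stddef.h>

    struct task {
        int rt_priority;        /* > 0 for SCHED_FIFO/SCHED_RR tasks */
        int counter;            /* remaining timeslice, normal tasks */
        struct task *next_run;
    };

    static struct task *rt_queue;   /* kept sorted by rt_priority */
    static struct task *run_queue;  /* everything else, scanned linearly */

    static int goodness(struct task *p)
    {
        return p->counter;          /* grossly simplified */
    }

    static struct task *pick_next(void)
    {
        struct task *p, *best = NULL;
        int c = -1000, w;

        /* Fast path: any runnable RT task pre-empts all normal
           tasks, so the (possibly long) normal run queue is never
           scanned. */
        if (rt_queue)
            return rt_queue;

        /* Slow path: the usual linear search for the best normal
           task. */
        for (p = run_queue; p; p = p->next_run) {
            w = goodness(p);
            if (w > c) {
                c = w;
                best = p;
            }
        }
        return best;
    }

    int main(void)
    {
        struct task normal = { 0, 5, NULL };
        struct task rt = { 50, 0, NULL };

        run_queue = &normal;
        printf("no RT task runnable: picked counter=%d\n",
               pick_next()->counter);

        rt_queue = &rt;
        printf("RT task runnable: picked rt_priority=%d\n",
               pick_next()->rt_priority);
        return 0;
    }

The point is that the cost of picking an RT task stops depending on
how many ordinary processes happen to be runnable.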

> It follows that those who really want to contribute to Linux
> performance might consult Alan's list of 2.2 showstoppers or find
> some other actual problem. I have a couple of scheduler related
> problems for RTLinux that I would love someone to solve. For
> example, it should be easy to write an SMP scheduler that would evict
> all non-rt tasks from a particular processor after at most N timer
> interrupts where N is the number of Linux processes assigned to that
> CPU. Or maybe someone could rewrite our RT-fifo module for better
> performance. Or how about a solid big physarea? ..

Go back to an earlier post of mine, where I suggested a separate RT
run queue as a possible 2.3 project. I didn't say "let's do this for
2.2".

I would rather spend my efforts improving standard Linux than
RT-Linux. No offence intended.

Regards,

Richard....
