Re: Interesting scheduling times

Richard Gooch (rgooch@atnf.csiro.au)
Fri, 18 Sep 1998 11:45:50 +1000

Linus Torvalds writes:
>
>
> On Fri, 18 Sep 1998, Richard Gooch wrote:
> >
> > Linus: what do you think of this idea? A valid project for 2.3?
> > I have to say I'm impressed with the soft-RT performance of Linux. In
> > my view the main limitation is the jiffies delay between when an RT
> > process is unblocked and when it starts running.
>
> A single run-queue is almost always better than multiple run-queues, and
> I'm very unlikely to change that.

Agreed that you want to avoid algorithms which scan many run queues.

> The reason for a single run-queue is that it's about 10 times
> simpler than any of the alternatives, and it's never slower in real
> life. Yes, we may end up walking a few more entries, but the
> simplicity more than pays back the cost of that walk.

It seems that just having two queues would still be
simple. Furthermore, having an RT-only run queue would allow us to
have an even simpler scheduler for RT processes, since you don't have
to calculate the weights/goodness. That would give us even faster
performance for RT scheduling.
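
A minimal sketch of the two-queue idea, under stated assumptions: this
is not kernel code. The struct task layout, rt_enqueue(),
other_enqueue(), weight() and pick_next() are hypothetical names made
up for illustration, and weight() is only a stand-in for the real
goodness calculation. The point it shows is that an RT queue kept
sorted by priority can be served by taking its head, while only the
non-RT queue needs the weight scan.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified task descriptor; not the kernel's task_struct. */
struct task {
    const char *name;
    int rt_priority;    /* used only on the RT queue; higher runs first */
    int counter;        /* remaining time slice; used only on the OTHER queue */
    int nice_bonus;     /* stand-in for the priority term of the weight */
    struct task *next;
};

/* Insert into the RT queue, keeping it sorted by rt_priority (highest
 * first). Tasks of equal priority stay in arrival order. */
static void rt_enqueue(struct task **rt_queue, struct task *t)
{
    assert(rt_queue != NULL && t != NULL);
    while (*rt_queue && (*rt_queue)->rt_priority >= t->rt_priority)
        rt_queue = &(*rt_queue)->next;
    t->next = *rt_queue;
    *rt_queue = t;
}

/* The OTHER queue is unsorted: push at the head. */
static void other_enqueue(struct task **other_queue, struct task *t)
{
    assert(other_queue != NULL && t != NULL);
    t->next = *other_queue;
    *other_queue = t;
}

/* Stand-in for the weight/goodness calculation (an assumption). */
static int weight(const struct task *t)
{
    return t->counter ? t->counter + t->nice_bonus : 0;
}

/* Pick the next task: the head of the RT queue if there is one (no
 * weights computed), otherwise the heaviest task on the OTHER queue. */
static struct task *pick_next(struct task *rt_queue, struct task *other_queue)
{
    struct task *best = NULL;
    struct task *t;
    int best_w = -1;

    if (rt_queue)
        return rt_queue;

    for (t = other_queue; t; t = t->next) {
        int w = weight(t);
        if (w > best_w) {
            best_w = w;
            best = t;
        }
    }
    return best;
}
```

In this sketch the cost of picking an RT task does not depend on how
many non-RT processes are runnable. The price is a sorted insert on
the RT queue, which should be cheap as long as few RT processes exist.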

Aside: would 3 queues possibly make sense: RT, OTHER and IDLE? Perhaps
that would make scheduling idle (RC5 cracking) processes and the idle
task simpler? Just a thought. I'm not really concerned with that, I'm
interested in better RT performance.

> Even under heavy load, the runqueue is seldom more than a few
> entries deep. More than 10 entries on the run-queue is already very
> rare, and when it does happen the scheduling overhead is very small
> compared to what else the machine is doing: having that many entries
> implies that the scheduler isn't your biggest bottle-neck anyway.

Agreed: I wouldn't expect most systems to have many processes on the
run queue anyway.

Ah, damn. I forgot to include my example of a large system controlling
an instrument as well as supporting users. See below.

> That said, the idea of just having two run-queues, one with
> real-time processes and one without is so far the best
> multi-runqueue idea I've heard. So yes, I could imagine doing
> something like that, but I still don't actually believe that the
> run-queue is the major bottle-neck.

At the very least I'd like to see the wake_up* functions set
need_resched=1 if an RT process is woken up. This way unblocked RT
processes will start running immediately.
If we do go with a special RT run queue, then the wake_up* functions
will need to be changed anyway to put RT processes onto the RT
runqueue.
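
A rough sketch of the wake-up change, assuming a much simplified
model: the enum policy values, struct task, struct cpu_state and
wake_up_task() below are hypothetical stand-ins, not the kernel's
wake_up* functions or data structures. It only illustrates the rule
"waking an RT process requests an immediate reschedule".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy names, standing in for the real scheduling classes. */
enum policy { POLICY_OTHER, POLICY_FIFO, POLICY_RR };

/* Hypothetical, simplified task descriptor. */
struct task {
    enum policy policy;
    int runnable;
};

/* Hypothetical per-CPU state holding the reschedule request flag. */
struct cpu_state {
    struct task *current;
    int need_resched;
};

static int is_rt(const struct task *t)
{
    return t->policy != POLICY_OTHER;
}

/* Mark the task runnable. If it is an RT task, also ask for a
 * reschedule right away instead of waiting for the next timer tick. */
static void wake_up_task(struct cpu_state *cpu, struct task *t)
{
    assert(cpu != NULL && t != NULL);
    t->runnable = 1;
    if (is_rt(t))
        cpu->need_resched = 1;
}
```

With a separate RT run queue, the same place in the wake-up path would
also be where the task is put on the RT queue rather than the normal
one.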

> PS. Here's the patch to make 2.1.122 perform as it should wrt scheduling,
> and not save the FP register state all the time. Embarrassing.

OK, I'll post some results later.

Finally: the example I was talking about. Consider a large instrument
(a radio telescope (an obvious choice considering where I
work:-)). Typically, the incoming data is fed to a correlator system
(dedicated hardware) and the correlator system output is fed into a
standard computer (CCC: correlator control computer), where the data
is written to disc. At the same time, you want to be able to observe
the data (to check for interference, signals and system
integrity). This requires processing, which should ideally keep up
with the data rate, although recording data and controlling the
telescope are more important.

The "traditional" solution is to have a dedicated CCC and a separate
dedicated telescope control computer (adjust pointings, read various
monitor points and so on). Finally, you have a third computer to do
online data reduction and visualisation. Unfortunately, this scheme is
rather complex and requires you to ship data between computers.

A good RT system would allow you to put it all on one computer and
data would not need to be shipped around (you'd use threads or
SHM). However, on such a system you will have a considerable number of
(low priority) user processes sitting on the run queue (the latest
generation of data reduction software is very modular and has lots of
processes doing different parts of the work). A dozen processes is not
unreasonable.

To give you an example of what a dozen more processes on the run queue
cost you, I ran my test on a Pentium 100. Without the extra processes
I get about 12 us per context switch. With the extra 12 processes I
get about 40 us per context switch. So that single run queue is going
to hurt.
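
A back-of-envelope reading of those two measurements, assuming the
extra cost grows linearly with the number of runnable processes (an
assumption; only the two data points above were measured):
(40 - 12) / 12 is roughly 2.3 us per extra run-queue entry. The helper
names below are hypothetical.

```c
#include <assert.h>

/* Extra context-switch cost per additional runnable process, assuming
 * the cost is linear in the run-queue length (an assumption). */
static double per_entry_cost_us(double base_us, double loaded_us, int extra)
{
    assert(extra > 0);
    return (loaded_us - base_us) / extra;
}

/* Estimated context-switch time with `extra` additional runnable
 * processes, under the same linear assumption. */
static double estimated_switch_us(double base_us, double per_entry_us, int extra)
{
    return base_us + per_entry_us * extra;
}
```

For example, per_entry_cost_us(12.0, 40.0, 12) gives about 2.33, and
feeding that back into estimated_switch_us(12.0, 2.33, 12) reproduces
roughly the measured 40 us.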

Again: I'm pretty impressed with the soft-RT performance of Linux, at
least as far as the scheduling overhead is concerned. Something else
to deal with is interrupt latency (in particular how long some drivers
block interrupts). Once that's sorted out, it seems to me Linux can be
classified as hard-RT capable.

Regards,

Richard....
