Re: Interesting scheduling times - NOT

Richard Gooch (rgooch@atnf.csiro.au)
Wed, 23 Sep 1998 12:47:38 +1000


Larry McVoy writes:
> : No, your claim is that my testcode is flawed. I have used both pipe
> : and yielding techniques and I get similar variances. You claim that
> : because you don't see the variances and I do, that my testcode is
> : flawed. It doesn't work that way. Just because you don't measure it
> : and I do doesn't mean my test is flawed. Your testing environment may
> : be different than mine.
>
> Unless you are running your test on a multi user system with lots of
> background activity (which would be insane), there is not any
> difference. I run my tests on a machine running X, with a perf
> monitor that updates every second, etc., and I don't see anything
> like what you are seeing.

As I've already said, you're probably not seeing the variance because
you don't run with RT priority. If I run my test with SCHED_OTHER then
I get low variance (a few percent). If I run with SCHED_FIFO, then I
can get high variance (in the median). I've managed to combat most of
that by removing most of the informative print messages that appear
before the benchmark runs, and by adding a 0.2 s delay after any
remaining messages, before the benchmark starts and before I go RT.
This means that by the time the benchmark runs, my shell, xterm and X
server should be idle again.
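
For reference, the RT switch looks something like this (a minimal
sketch, not my exact code; the priority value and the exact delay are
placeholders):

/* Sketch only: settle, then go SCHED_FIFO just before the timed section. */
#include <stdio.h>
#include <sched.h>
#include <unistd.h>

static void go_realtime(void)
{
    struct sched_param sp;

    /* 0.2 s settling delay: let processes woken by earlier messages
       (shell, xterm, X server) go back to sleep.  */
    usleep(200000);

    sp.sched_priority = sched_get_priority_max(SCHED_FIFO);
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler");   /* requires root */
}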

Further, I've added code which counts the number of processes on the
run queue just before and after the benchmark. The minimum number of
processes on the run queue is 2 (of course), but the maximum sometimes
got as high as 10!
It was not uncommon to start the benchmark with 2 processes on the run
queue and finish with 10.
With the settling-down delay, I'm now finishing the benchmark with 2
or 3 processes on the run queue (rarely 4).
This has made the variance in the median come down to 10% in the case
where I don't launch low-priority processes to lengthen the run queue.
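
For the curious, a rough user-space way to sample the run queue length
is something like the following (a sketch only; my actual counter is
different, and this assumes the running/total field of /proc/loadavg):

/* Sketch: read the "running/total" field of /proc/loadavg just before
   and just after the benchmark to estimate the run queue length. */
#include <stdio.h>

static int count_runnable(void)
{
    FILE *fp = fopen("/proc/loadavg", "r");
    float l1, l5, l15;
    int running = -1, total = 0;

    if (fp == NULL)
        return -1;
    /* Format: "0.04 0.02 0.01 2/87 1234" -> 2 runnable out of 87 tasks. */
    if (fscanf(fp, "%f %f %f %d/%d", &l1, &l5, &l15, &running, &total) != 5)
        running = -1;
    fclose(fp);
    return running;
}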

Where I launch 10 extra processes, the median context switch time can
go from 6.5 us to 11.4 us. In the 6.5 us case, the benchmark typically
finishes with 12 processes on the run queue. In the 11.4 us case, I'm
seeing 14 or 15 processes on the run queue.
The no-extra-processes time is 5.0 us.

Note that the low-priority processes are just forked, so they have COW
page sharing with the benchmark process. The extra 2 or 3
unaccounted-for processes (probably the X server, xmeter and such)
will of course have a completely different set of pages and cache
lines, so it's not surprising that their per-process cost is higher
than for the low-priority processes.
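
The extra load is generated along these lines (a sketch, not verbatim;
the nice value and the lack of cleanup are simplifications):

/* Sketch: fork N low-priority busy-loop children; because they are
   forked from the benchmark, they share COW pages with it. */
#include <sys/time.h>
#include <sys/resource.h>
#include <unistd.h>

static void spawn_spinners(int n)
{
    int i;

    for (i = 0; i < n; i++) {
        if (fork() == 0) {
            setpriority(PRIO_PROCESS, 0, 19);   /* lowest SCHED_OTHER nice */
            for (;;)
                ;                               /* keep the run queue long */
        }
    }
}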

> : No, again, my benchmark is not flawed. Look, you are trying to do
> : something different with your benchmark. Your focus is to compare
> : between different OSes and to see what the "normal" context switch
> : time is.
>
> It's perfectly fine that you want to do something else. I have no
> problem with your goals but serious problems with your methodology.
> The problem is based an apples to apples comparison: when you run your
> pipe version with no background processes, you should be able to duplicate
> my results very closely. But you can't - you get this huge variance.
> Until that part of your benchmark is fixed, I, for one, am unwilling
> to even consider any other part of your results - I have no reason to
> believe them and a substantial reason not to believe them.

As I've explained before and above, your test doesn't use SCHED_FIFO,
and hence you are unlikely to get other processes lengthening your run
queue. This lengthening seems to have a substantial impact on the
variance. When I gave my shell, xterm and X server time to get off the
run queue, the variance went right down.

I'm left with variance (up to 50%) in the long run queue case. I can
sometimes see this variance even with SCHED_OTHER, so there is still
some other effect going on. Again, I don't see a variance this large
with your test, so there is something that my test is sensitive to.
Using pipes and token passing doesn't change the variance, BTW.
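
For completeness, the pipe variant is essentially the classic
token-passing loop (simplified here; the real test also records
run-queue lengths and per-pass times so I can take the median):

/* Sketch: two processes bounce a byte through a pair of pipes; each
   round trip costs two context switches. */
#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>

#define PASSES 10000

int main(void)
{
    int to_child[2], to_parent[2], i;
    char token = 't';
    struct timeval start, stop;
    double usec;

    pipe(to_child);
    pipe(to_parent);
    if (fork() == 0) {
        for (i = 0; i < PASSES; i++) {          /* child: echo the token */
            read(to_child[0], &token, 1);
            write(to_parent[1], &token, 1);
        }
        _exit(0);
    }
    gettimeofday(&start, NULL);
    for (i = 0; i < PASSES; i++) {
        write(to_child[1], &token, 1);
        read(to_parent[0], &token, 1);
    }
    gettimeofday(&stop, NULL);
    usec = (stop.tv_sec - start.tv_sec) * 1e6 +
           (stop.tv_usec - start.tv_usec);
    printf("%.2f us per switch\n", usec / (2.0 * PASSES));
    return 0;
}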

> : that case seeing cache-induced variance is good, because it can expose
>
> How many times before it sinks in: 77% variance is not cache
> induced. If that were true, then nothing would be deterministic.
> You wouldn't be able to say "time make" and expect to get anything
> like the same number two times in a row, yet people do that all the
> time.

Note that "time make" is timing a much longer operation than timing
context switches. So if some other system event comes along and steals
cycles or pollutes the cache, context switch timing is going to be
more sensitive. This is not to say that it *must* be all cache
induced. It's just one of many possibilities that must be considered.
I did manage to turn off my L2 cache (the L1 cache doesn't seem to
want to turn off), and the cost per process of lengthening the run
queue went from around 0.2 us to 1.2 us. This does show how important
the cache is.

A sensitive test like mine is more likely to expose problems, so it
forces me to explore all the variables until the variability can be
eliminated or at least accounted for.

Instead of abusing me and my test, why not look at the code and try to
figure out *why* we are getting different numbers? That's how science
is done. If your results are different from someone else's, you go and
figure out why; you don't abuse them and their work.
I'm looking at your code to see why it yields less variance. When the
answer is found, I'm sure it will be useful.
If I knew *why* my test is more sensitive, the problem would be
solved.

Regards,

Richard....

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/