Re: BFS vs. mainline scheduler benchmarks and measurements
From: Ingo Molnar
Date: Tue Sep 08 2009 - 06:12:36 EST
* Nikos Chantziaras <realnc@xxxxxxxx> wrote:
> On 09/08/2009 11:04 AM, Ingo Molnar wrote:
>>
>> * Pekka Pietikainen<pp@xxxxxxxxxx> wrote:
>>
>>> On Mon, Sep 07, 2009 at 10:57:01PM +0200, Ingo Molnar wrote:
>>>>>> Could you profile it please? Also, what's the context-switch rate?
>>>>>
>>>>> As far as I can tell, the broadcom mips architecture does not have
>>>>> profiling support. It does only have some proprietary profiling
>>>>> registers that nobody wrote kernel support for, yet.
>>>> Well, what does 'vmstat 1' show - how many context switches are
>>>> there per second on the iperf server? In theory if it's a truly
>>>> saturated box, there shouldn't be many - just a single iperf task
>>>
>>> Yay, finally something that's measurable in this thread \o/
>>
>> My initial posting in this thread contains 6 separate types of
>> measurements, rather extensive ones. Out of those, 4 measurements
>> were latency oriented, two were throughput oriented. Plenty of
>> data, plenty of results, and very good reproducibility.
>
> None of which involve latency-prone GUI applications running on
> cheap commodity hardware though. [...]
The lat_tcp, lat_pipe and pipe-test numbers are all benchmarks that
characterise such workloads - they show the latency of context
switches.
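To make concrete what a pipe-based latency test measures, here is a minimal sketch of a one-byte ping-pong between two processes over a pair of pipes. It is not the lmbench lat_pipe or pipe-test source; the function name and iteration count are made up for illustration, and in Python the interpreter overhead dominates the result, so only the shape of the measurement carries over. On a multi-core box the two tasks may also run on different CPUs unless they are pinned.

```python
import os
import time


def pipe_round_trip_usecs(iterations=10000):
    """Average round trip, in microseconds, of a 1-byte ping-pong
    between a parent and a child process connected by two pipes."""
    p2c_r, p2c_w = os.pipe()  # parent -> child
    c2p_r, c2p_w = os.pipe()  # child -> parent

    pid = os.fork()
    if pid == 0:
        # Child: echo each byte straight back, then exit without
        # returning into the caller's code.
        try:
            os.close(p2c_w)
            os.close(c2p_r)
            for _ in range(iterations):
                os.write(c2p_w, os.read(p2c_r, 1))
        finally:
            os._exit(0)

    # Parent: each iteration blocks until the child has been woken up,
    # has run, and has written its reply.
    os.close(p2c_r)
    os.close(c2p_w)
    start = time.perf_counter()
    for _ in range(iterations):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
    elapsed = time.perf_counter() - start

    os.waitpid(pid, 0)
    os.close(p2c_w)
    os.close(c2p_r)
    return elapsed / iterations * 1e6
```

Calling `pipe_round_trip_usecs()` a few times and comparing the results gives a rough feel for how much run-to-run noise such a number carries.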
I also tested the one area where Con posted numbers showing that BFS
has an edge over mainline: kbuild performance. Should I not have done that?
Also note the interbench latency measurements that Con posted:
http://ck.kolivas.org/patches/bfs/interbench-bfs-cfs.txt
--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)   Max Latency   % Desired CPU   % Deadlines Met
None     0.004 +/- 0.00436     0.006         100             100
Video    0.008 +/- 0.00879     0.015         100             100
X        0.006 +/- 0.0067      0.014         100             100
Burn     0.005 +/- 0.00563     0.009         100             100
Write    0.005 +/- 0.00887     0.16          100             100
Read     0.006 +/- 0.00696     0.018         100             100
Compile  0.007 +/- 0.00751     0.019         100             100
Versus the mainline scheduler:
--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)   Max Latency   % Desired CPU   % Deadlines Met
None     0.005 +/- 0.00562     0.007         100             100
Video    0.003 +/- 0.00333     0.009         100             100
X        0.003 +/- 0.00409     0.01          100             100
Burn     0.004 +/- 0.00415     0.006         100             100
Write    0.005 +/- 0.00592     0.021         100             100
Read     0.004 +/- 0.00463     0.009         100             100
Compile  0.003 +/- 0.00426     0.014         100             100
Look at those standard deviation numbers: their spread is way too
high, often 50% or more, which makes such noisy data very hard to compare.
Furthermore, they happen to show the 2.6.30 mainline scheduler
outperforming BFS in almost every interactivity metric.
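The noise can be put into numbers by dividing each standard deviation by its mean. The sketch below does that for the rows of the two tables above; the variable and function names are only illustrative, and since the reported means are rounded to three decimals the ratios are approximate.

```python
# (load, mean latency in ms, standard deviation in ms), copied from
# the two interbench tables quoted above.
bfs = [
    ("None", 0.004, 0.00436), ("Video", 0.008, 0.00879),
    ("X", 0.006, 0.0067), ("Burn", 0.005, 0.00563),
    ("Write", 0.005, 0.00887), ("Read", 0.006, 0.00696),
    ("Compile", 0.007, 0.00751),
]
mainline = [
    ("None", 0.005, 0.00562), ("Video", 0.003, 0.00333),
    ("X", 0.003, 0.00409), ("Burn", 0.004, 0.00415),
    ("Write", 0.005, 0.00592), ("Read", 0.004, 0.00463),
    ("Compile", 0.003, 0.00426),
]


def relative_sd(rows):
    """Map each load to SD / mean (the coefficient of variation)."""
    return {load: sd / mean for load, mean, sd in rows}


if __name__ == "__main__":
    for name, rows in (("BFS", bfs), ("mainline", mainline)):
        for load, ratio in relative_sd(rows).items():
            print(f"{name:9s} {load:8s} SD/mean = {ratio:.0%}")
```

With the rounded means as given, every ratio comes out above 1, i.e. the reported spread is at least as large as the reported mean in each row.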
Check it for yourself and compare the entries. I haven't made those
measurements, Con did.
For example 'Compile' latencies:
--- Benchmarking simulated cpu of Audio in the presence of simulated ---
          Load     Latency +/- SD (ms)   Max Latency   % Desired CPU   % Deadlines Met
v2.6.30:  Compile  0.003 +/- 0.00426     0.014         100             100
BFS:      Compile  0.007 +/- 0.00751     0.019         100             100
but ... with a near 100% standard deviation that's pretty hard to
judge. The Max Latency went from 14 usecs under v2.6.30 to 19 usecs
on BFS.
> [...] I listed examples where mainline seems to behave
> sub-optimal and ways to reproduce them but this doesn't seem to be
> an area of interest.
It is an area of interest of course. That's how the interactivity
results above became possible.
Ingo