Re: BFS vs. mainline scheduler benchmarks and measurements

From: Nikos Chantziaras
Date: Tue Sep 08 2009 - 06:40:58 EST


On 09/08/2009 01:12 PM, Ingo Molnar wrote:

* Nikos Chantziaras <realnc@xxxxxxxx> wrote:

On 09/08/2009 11:04 AM, Ingo Molnar wrote:

* Pekka Pietikainen <pp@xxxxxxxxxx> wrote:

On Mon, Sep 07, 2009 at 10:57:01PM +0200, Ingo Molnar wrote:
Could you profile it please? Also, what's the context-switch rate?

As far as I can tell, the broadcom mips architecture does not have
profiling support. It does only have some proprietary profiling
registers that nobody wrote kernel support for, yet.
Well, what does 'vmstat 1' show - how many context switches are
there per second on the iperf server? In theory if it's a truly
saturated box, there shouldn't be many - just a single iperf task

Yay, finally something that's measurable in this thread \o/

My initial posting in this thread contains 6 separate types of
measurements, rather extensive ones. Out of those, 4 measurements
were latency oriented, two were throughput oriented. Plenty of
data, plenty of results, and very good reproducibility.

None of which involve latency-prone GUI applications running on
cheap commodity hardware though. [...]

The lat_tcp, lat_pipe and pipe-test numbers are all benchmarks that
characterise such workloads - they show the latency of context
switches.
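
For readers unfamiliar with these tools: pipe-test and lat_pipe measure context-switch latency by bouncing a token between two tasks over a pipe, so each round trip costs a pair of context switches. A minimal sketch of that kind of ping-pong micro-benchmark (assumed structure, not the actual source of either tool; error handling kept minimal) looks roughly like this:

/*
 * Pipe ping-pong sketch: parent and child bounce one byte back and
 * forth through two pipes.  Each round trip involves at least two
 * context switches when both tasks share a CPU, so the per-switch
 * latency is roughly half the measured round-trip time.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/wait.h>

#define LOOPS 100000

int main(void)
{
    int ptc[2], ctp[2];     /* parent->child and child->parent pipes */
    char buf = 0;
    struct timeval start, end;
    double usecs;
    int i;

    if (pipe(ptc) || pipe(ctp)) {
        perror("pipe");
        return 1;
    }

    if (fork() == 0) {
        /* child: echo every byte straight back to the parent */
        for (;;) {
            if (read(ptc[0], &buf, 1) != 1)
                exit(0);
            write(ctp[1], &buf, 1);
        }
    }

    gettimeofday(&start, NULL);
    for (i = 0; i < LOOPS; i++) {
        write(ptc[1], &buf, 1);
        read(ctp[0], &buf, 1);
    }
    gettimeofday(&end, NULL);

    usecs = (end.tv_sec - start.tv_sec) * 1e6 +
            (end.tv_usec - start.tv_usec);
    /* two switches per round trip; syscall overhead is included, so
       this is only an approximation of pure context-switch cost */
    printf("%.2f usecs per context switch (approx.)\n",
           usecs / LOOPS / 2.0);

    close(ptc[1]);          /* child sees EOF and exits */
    wait(NULL);
    return 0;
}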

I also tested where Con posted numbers that BFS has an edge over
mainline: kbuild performance. Should I not have done that?

It's good that you did, of course. However, when someone reports a problem, the developer usually tries to reproduce it; he needs to see what the user sees. This is how it's usually done, not only in most other development environments but also here, from what I could gather by reading this list. Given reports about interactivity issues that come with very specific reproduction steps, I would have expected developers interested in identifying the issue to reproduce the same problem and work from there. That would mean that you (or anyone else with an interest in tracking this down) would follow the examples given by me and others: enabling desktop compositing, firing up mplayer with a video, and generally reproducing the problem using the quite detailed steps I posted as a recipe.

In this case, however, instead of the above, raw numbers are posted from batch jobs and benchmarks that don't actually reproduce the issue as described by the reporter(s). That way the developer never experiences the issue first-hand, and may therefore miss its real cause. In most other bug reports the right thing seems to happen and the devs try to reproduce them exactly as described, but not in this case. I suspect that is because most devs don't run the necessary software components on their machines, so reproducing the issue exactly as described would take too much time?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/