Re: Interesting scheduling times - NOT

Richard Gooch (rgooch@atnf.csiro.au)
Fri, 25 Sep 1998 11:01:46 +1000


Larry McVoy writes:
> Richard Gooch <rgooch@atnf.csiro.au>:
> : So are you still convinced that my test is broken (even
> : though neither you, Linus or anyone else has shown *what* is broken)?
>
> Yes.

The evidence (variation in lmbench results) is not in your favour.

> : There are at least two separate discussions going on here. One is
> : about the cost of extra processes on the run queue. I've demonstrated
> : similar numbers for this with lmbench and my code. It seems we have
> : provisional agreement on this cost. The discussion now is whether the
> : cost is significant.
>
> Which has been my other point all along. I don't think you have a
> system that is well tuned enough that you are going to notice the extra
> 2us per context switch. And if you do have such a system, I claim that
> the author of that system could trivially modify the application to get
> rid of that overhead.

The application I monitored (up to 10 processes on the run queue) is
part of the latest generation astronomical data reduction software
currently being developed by an international consortium of radio
astronomy institutions. This software is highly modular and has a
central co-ordination process which communicates with "agents"
(processes) to perform most of the work.
IIRC over 50 man-years has already been put into this project. It is
far too late to change something this fundamental to the design.
BTW: it's not *my* design. I'm just telling you what is out there.

There are a few things in Linux which would prevent a happy
co-existence between an RT application and this data reduction
application. One of those things is run queue length costs in RT
wakeup latencies. That is what I'm addressing at the moment.
Another thing is drivers which globally disable interrupts for too
long (8390). This is easy to solve: don't use those drivers or rip
out/reduce the interrupt disabling.

> : The other discussion regards the variance I have measured. Here you
> : persist in saying my test is flawed, and yet so far not a single one
> : of your suggestions as to why it may be flawed has stood up to the
> : test.
>
> Yo, Richard. Go back and find a posting that starts out "Hmm, I've
> debugged Richard's code and here's why it varies". You won't because
> there isn't one. I'm not saying I know what the cause of your problems
> is and I don't really care what the cause of your problems is. I am
> saying that your benchmark results are meaningless given the variance
> and no explanation of the variance. That's all. Focus.

I think you need the focus. The variance is pointing to something
interesting/unexpected going on. Should I consider lmbench results
meaningless too, because they also yield variance?
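
As an illustration only (this is not the test code under discussion, nor
anything from lmbench), here is a minimal sketch of summarising repeated
timing runs so that the spread is reported alongside the central value
rather than discarded. The function name summarise_runs and the sample
values are hypothetical.

```python
import statistics

def summarise_runs(samples_us):
    """Summarise repeated timing samples (in microseconds), keeping the spread visible."""
    ordered = sorted(samples_us)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "mean": statistics.mean(ordered),
        # Sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
        "max": ordered[-1],
    }

# Hypothetical numbers, for illustration only: one run is a clear outlier.
runs = [5.1, 5.0, 5.2, 9.8, 5.1, 5.0]
summary = summarise_runs(runs)
print(summary)
```

Reporting min, median and max next to the mean makes an outlier run
visible, which is the point at which one would start asking what caused
it.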

> : Now, I agree in principle that you should fix broken applications
> : rather than fix the OS. But it doesn't always work this way.
>
> Yes it does. That's the difference between Linux and BloatOS. We
> don't have to cater to your application, you aren't paying us to do
> so. So unless you can prove that it is worthwhile to add more code
> to the kernel, it doesn't happen.

It will add a small amount of code in a couple of places and will
simplify the code in other places.

> : No, I'm not asking you to debug it. All I ask is that if you don't feel
> : like being constructive, at least don't be destructive. Constructive
> : criticism is welcome. Finger pointing, abuse and aggression are not.
>
> Come on, Richard. Do you want there to be no standards for kernel
> hackers? What do you suggest we do when people show up with no
> experience and want to check in their favorite thing to the kernel?
> I'm sorry, but the answer isn't "that's nice, have fun". If you
> can't stand the heat...

If you had not been abusive and aggressive, I would not be bringing
you to book. I don't have a problem with people disagreeing with me. I
may or may not end up agreeing with their point of view.

I don't accept people being rude or obnoxious. There is absolutely no
justification for labelling someone's code "broken" without even the
basic decency to look at their code and point out what is wrong with
it.

Again: if you don't agree with my results, just say so simply and
politely.
Don't be abusive.
Don't be aggressive.
Don't be impolite.
Don't use emotive language.

> You have one fatal flaw in your logic: you think my arguments are
> not constructive because they aren't helping /you/. I could care
> less about you, you aren't the point. This isn't the
> richard-gooch-helpers list, this is the linux-kernel list. The
> focus here is what is good for the linux kernel, not anything else.

This is a list where people can and should discuss ideas and put forth
results. This is also a place where people can (and many times have)
presented a problem or a result and asked for help. Asking is not
demanding. I didn't put my results up and say "I require people to
analyse my code".

I'll say it again. Maybe it will eventually penetrate: I don't mind if
you disagree with me. Just keep it polite and respectful.

> Think about that and reconsider my position - makes a little more
> sense, no?

No, your position makes no sense. Your problem is:
You are aggressive.
You are impolite.
You are abusive.
You use emotive language.
You denigrate the work of others.

I don't know why you behave like this, but the linux kernel list isn't
the place for it.

> : Unexpected results don't automatically invalidate the results.
>
> s/Unexpect/Unexpect and unexplained/ and the statement is absolutely correct
> in the engineering world. You have to be able to explain your results.

Ultimately, yes. But until the problem cause(s) is exposed, it remains
unexplained.

> : I want to expose the problem and track it down.
>
> Great! That's what I've been asking you to do since message #1. So
> go do it.

And I have been doing that and have been telling you for days that I
have.

> : BTW: your comment about "deferring" betrays an underlying arrogance.
>
> Nah, it shows that I think I've done a hell of a lot more homework
> than you have.

The variance in the lmbench results shows you've missed something.
It also shows that you lack tact.

> : So, if you're not willing to be constructive, or willing to admit that
> : your code may be flawed because it's *insensitive* to the effect(s),
> : just go away and let me get on with tracking down the problem.
>
> Richard, it's you that insists on having this public argument and
> debugging session. When Linus tried to take it private, and I
> concurred, you dragged it out on the list again. If you want to
> have a public argument, you aren't going to get anywhere by telling
> people that disagree with you to go away. You're going to get
> somewhere by doing your homework and showing us the right answer.

At least one of the (two?) messages Linus sent privately also appeared
on the list, so I assumed my duplicate filtering had fallen over (the
database has been filling up rather quickly the last couple of weeks).
My mistake, apparently.

I have not told you to go away because you disagree. I've told you to
stop the mantra that my benchmark is "broken" simply because I've
noticed variance and you haven't.
If you can't be polite, go away.

> I'm sure you can do it, you're smart, so go do it.

Gee, thanks.

Regards,

Richard....
