RE: [ 11/37] sched/nohz: Rewrite and fix load-avg computation -- again

From: Doug Smythies
Date: Sat Jul 21 2012 - 12:03:05 EST

> On 2012.07.20 10:26 -0700, Peter Zijlstra wrote:
>> On Fri, 2012-07-20 at 12:13 -0500, Jonathan Nieder wrote:
>> > Peter Zijlstra wrote:
>> >> On Tue, 2012-07-17 at 19:16 -0500, Jonathan Nieder wrote:
>> >> I'm thrilled to see this regression fix for stable@, but are we really
>> >> really sure that it won't cause new regressions?
>> >
>> > Doug Smythies ran a ~68 hour test on it, running various synthetic
>> > loads of various frequencies against it and comparing the reported load
>> > averages against the expected values and found it to be 'good'.
>> >
>> > This doesn't guarantee we won't find more 'interesting' problems in
>> > there, but it does give me fair confidence in it.
>> Yeah, that sounds good. Very nice to hear.
>> Is the code to generate the synthetic loads and expected results
>> somewhere easy to find (like LTP or tools/testing) to make it easier
>> to keep this code working well in the future?

> /me finds Doug isn't actually on the CC, /me fixes.


> Doug had this web-page with all his testing activities, graphs, code,
> etc.
> Seems to still work.

Those web pages will be there for a long time (years).

> Last time I tried his scripts they weren't very user friendly, and afaik
> he's making the pretty graphs 'manually'. But whatever he's got is there
> I think.

Yes, pretty graphs were manually done.
Yes, scripts lack user friendliness, but everything I used is posted.

> If someone wants to take it and make it pretty and 'usable' for people
> in a hurry I'm sure Doug wouldn't mind.

Someday I might make it more usable myself. Peter's "consume.c" is
also very useful. (I haven't posted it in my web notes yet, but I will.)

The 68 hour test was just one of the test runs, albeit the main one.
Among the other tests was what I call the "Charles Wang" scenario:
high frequency, high load.
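
For anyone unsure what "expected" load average means in these
comparisons, here is a minimal sketch. It is not the posted test code.
It assumes the load average behaves as an exponentially damped average
sampled about every 5 seconds, and that the long-run target for a
periodic load is (number of processes) x (duty cycle). The names
ema_step, expected_load and simulate are made up for illustration.

```python
import math

SAMPLE_PERIOD_S = 5.0  # assumed sampling interval (roughly 5 s)


def ema_step(load, active, window_s):
    """One update of an exponentially damped average over window_s seconds."""
    decay = math.exp(-SAMPLE_PERIOD_S / window_s)
    return load * decay + active * (1.0 - decay)


def expected_load(n_procs, duty):
    """Assumed long-run target: each process is runnable `duty` of the time."""
    return n_procs * duty


def simulate(n_procs, duty, work_period_s, duration_s, window_s=60.0):
    """Feed the average with the runnable count seen at each sample instant."""
    load = 0.0
    steps = int(duration_s / SAMPLE_PERIOD_S)
    for i in range(steps):
        t = i * SAMPLE_PERIOD_S
        # Position within the work/sleep cycle, in [0, 1).
        phase = (t % work_period_s) / work_period_s
        active = n_procs if phase < duty else 0
        load = ema_step(load, active, window_s)
    return load
```

In this toy model, if the work period is a multiple of the sampling
interval, every sample lands on the same phase of the cycle and the
result will not approach expected_load(). That is only a property of
the sketch, but it shows why the load frequency matters in these tests.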

The only operating scenario of potential concern was at higher loads
with a higher number of processes, where the reported load average
was a little low, and worse than under the same conditions without
this patch, although still pretty good (graph attached).

Attachment: freq_5proc.png
Description: PNG image