Re: Linux 2.6.35.6

From: Florian Mickler
Date: Wed Sep 29 2010 - 03:29:33 EST


On Tue, 28 Sep 2010 15:03:58 -0400
tmhikaru@xxxxxxxxx wrote:

> On Tue, Sep 28, 2010 at 08:35:05AM +0200, Florian Mickler wrote:
> > >
> > > Here's a graphical example of just how wacky this is:
> > >
> > > http://yfrog.com/6lloadbp
> > >
> > > In this image, the dip down to less than 0.5 after the 18th is due to me
> > > experimenting using the slackware distribution kernel (2.6.33.4) after I
> > > finally noticed something was amiss. The sharp rise afterwards is due to me
> > > first, building 2.6.35.5, and then afterwards, using it. To be perfectly
> > > clear, I've previously used 2.6.34.2 and did not experience the problem
> > > there either, nor is it in 2.6.33.4.
> >
> > What load figure are you basing your observations on? The 15 minutes
> > average should be the most interesting (sampled at a 7 minutes
> > interval...)
>
> My observations are based on letting the machine idle immediately after
> bootup. I monitor the state of the machine using a program called conky,
> which I have configured to show disk I/O, CPU use, swap I/O and, among other
> things, the load average. Immediately after booting, my load average tends to
> peak at about 2.5 to 3.0; on a working kernel this eventually settles down
> to 0.00 to 0.05 in about ten minutes. On kernels that exhibit this problem,
> it doesn't settle lower than 0.3 and is much more likely to hang anywhere
> from 0.8 to 1.2. In fact, if I give it enough time it'll rise and fall
> constantly without any (visible) work being done. So basically I boot
> the machine and go get a drink, come back, and if it's been ten minutes
> with no disk I/O, CPU use, or any other activity recorded and it's
> still above 0.3, something's not working right.

Do you know which load average conky is showing you? If I
type 'uptime' on a console, I get three load numbers: the 1-minute,
5-minute and 15-minute averages.
If there is a systematic bias, it should be visible in the
15-minute average. If there are only bursts of 'load', they should be
visible in the 1-minute average.
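
To make that distinction concrete, here is a minimal sketch that reads the
three figures straight from /proc/loadavg and says which kind of disturbance
they suggest. The function names are made up for illustration, and the 0.3
cut-off is an assumption borrowed from your description of a "bad" idle
machine, not an established threshold.

```python
def parse_loadavg(text):
    """Return the (1-minute, 5-minute, 15-minute) averages from /proc/loadavg-style text."""
    one, five, fifteen = (float(field) for field in text.split()[:3])
    return one, five, fifteen


def describe(one, five, fifteen, threshold=0.3):
    """Classify an idle machine's load; threshold is an assumed cut-off."""
    if fifteen > threshold:
        return "sustained"  # systematic bias shows up in the 15-minute average
    if one > threshold:
        return "burst"      # short-lived load shows up only in the 1-minute average
    return "idle"


def read_loadavg(path="/proc/loadavg"):
    """Read and parse the current load averages (Linux only)."""
    with open(path) as handle:
        return parse_loadavg(handle.read())
```

On an idle box, describe(*read_loadavg()) returning "sustained" ten minutes
after boot would point at the systematic case.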

But it doesn't really matter for now what kind of load disturbance you
are seeing, because you actually have a better way to distinguish a good
kernel from a bad one:

On Mon, 27 Sep 2010 12:32:08 -0400
tmhikaru@xxxxxxxxx wrote:

> *Something* is wrong beyond the
> mere loadaverage numbers going crazy however, since timed runs of kernel
> compiles done with my distro's kernel and 2.6.35.5 show that while there is
> no *apparent* use of cpu or disk showing in vmstat while the machine is
> idle, the compiles on the newer kernel are taking approximately twice as
> long as before.
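
A timed build gives a yes/no answer per kernel that does not depend on the
load figures at all. A rough sketch of such a check, assuming you run the
same command in the same tree under each kernel (the helper names are
hypothetical, and which build command you time is up to you):

```python
import subprocess
import time


def timed_run(command):
    """Run command (a list of arguments) and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        command,
        check=True,                     # fail loudly if the build itself breaks
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def slowdown(baseline_seconds, candidate_seconds):
    """Ratio of candidate to baseline; about 2.0 would match 'twice as long'."""
    return candidate_seconds / baseline_seconds
```

For example, record timed_run(["make"]) on the known-good kernel, repeat it
on the kernel under test, and mark the kernel bad if slowdown() comes out
near 2 rather than near 1.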


> If you're talking about the graph,
> I merely posted it to show that I've been having this problem for over a
> month, and it's demonstrably causing very inconsistent load averages. (Which
> is why the graph isn't anything close to a line, it's a mess!) the graph
> takes a reading every five minutes, if you were wondering about the sample
> rate.

Yes, the sample rate was one of the things I wanted to know. I also wanted to
know which of the three load figures you were graphing.

> In other news, I'm in the process of bisecting but keep having to skip
> bisects that have compile errors. Sigh. Still at 12 hops, somewhere around
> five thousand commits to check.

Good.
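
Those two numbers fit together: each tested commit roughly halves the
remaining range, so the step count is about the base-2 logarithm of the
number of commits. A small sketch of the arithmetic (the function name is
only for illustration):

```python
import math


def bisect_steps(commits):
    """Approximate number of good/bad tests needed to bisect a range of commits."""
    return math.log2(commits)


steps = bisect_steps(5000)  # a little over 12
```

Skipped commits do not narrow the range, so expect the real count to end up
somewhat higher than this estimate.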

>
> Tim McGrath
>

Regards,
Flo