Re: Decreasing stime running confuses top (was: top displaying 9999% CPU usage)
From: Frans Pop
Date: Wed Oct 03 2007 - 19:33:00 EST
On Wednesday 03 October 2007, you wrote:
> On Wed, Oct 03, 2007 at 09:27:41PM +0200, Frans Pop wrote:
> > On Wednesday 03 October 2007, you wrote:
> > > On Wed, 3 Oct 2007, Ilpo Järvinen wrote:
> > > > On Wed, 3 Oct 2007, Frans Pop wrote:
> > > > > The only change is in 2 consecutive columns: "2911 502" -> "2912
> > > > > 500". Is processor usage calculated from those? Can someone
> > > > > explain how?
> > > >
> > > > The latter seems to be utime ...decreasing. No wonder if
> > > > arithmetics will give strange results (probably top is using
> > > > unsigned delta?)...
> > >
> > > Hmm, minor miscounting from my side, stime seems more appropriate...
> >
> > So, is it normal that stime decreases sometimes or a kernel bug?
> > /me expects the last...
>
> Let me guess... Dual core AMD64 ?
Nope: Intel(R) Pentium(R) D CPU 3.20GHz
> I'm 99.99% sure that if you boot with "notsc", the problem disappears.
Not really. With that first test I did have:
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
If I boot with 'notsc', I get:
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
hpet
But the problem is still exactly the same:
Oct 04 00:53:37 545 92
Oct 04 00:53:38 545 94
Oct 04 00:53:43 546 92 <--
Oct 04 00:53:49 547 94
Oct 04 00:53:54 549 93 <--
Oct 04 00:54:00 550 94
Some relevant lines from kernel log:
checking TSC synchronization [CPU#0 -> CPU#1]: passed <--- Not there with 'notsc'
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
ACPI: RTC can wake from S4
Time: hpet clocksource has been installed.
hpet_resources: 0xfed00000 is busy
Time: tsc clocksource has been installed. <--- Not there with 'notsc'
> If so, you have one of those wonderful AMD64 with unsynced clock and
> without HPET to sync with. I wrote a simple program in the past to exhibit
> the problem. It would bsimply run "date +%s" in a busy loops and display
> each time it would change. Amazing. It could jump back and forth by up to
> 3 seconds!
I tried your script, but the clock runs perfectly. Never saw anything
other than a 1 second increment.
The following may well be relevant.
With 2.6.22 and early 2.6.23-rc kernels (rc3-rc6) I often had this in my
kernel log (see http://lkml.org/lkml/2007/9/16/45):
checking TSC synchronization [CPU#0 -> CPU#1]:
Measured 248 cycles TSC warp between CPUs, turning off TSC clock.
Marking TSC unstable due to check_tsc_sync_source failed
Some boots the TSC synchronization would be OK, but I'd see ~2/3 failures.
Kernels before 2.6.22 did not have this problem.
However, checking my logs now I see that these messages have disappeared
since 2.6.23-rc7. Now the TSC synchronization check always passes.
I also tried with 2.6.22-6 and with that the jumping around is _not_
present. This was a boot where TSC synchronization failed, so with hpet
as clocksource.
Also, the numbers stay constant much longer and have bigger increments
(updates look to be once per minute?):
Oct 04 01:24:19 465 67
Oct 04 01:24:50 467 69
Oct 04 01:24:51 469 72
Oct 04 01:25:51 474 76
Oct 04 01:26:50 478 80
Cheers,
Frans Pop
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/