Q: Getting consistent CPU status rates from /proc/stat

From: Ulrich Windl
Date: Thu Jul 02 2020 - 09:33:50 EST


(Note: I'm not subscribed to the kernel list, so if you want to talk to me, keep me on CC: at least)

Hi!

I wrote a monitoring plugin for the current SLES12 kernel (e.g. 4.12.14-122.26-default) that reads the absolute counters from /proc/stat, computes the differences between successive readings, and then derives rates from those differences. For simplicity I used the summary ("cpu") values only.
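
To make the calculation concrete, here is a minimal sketch in Python (not the plugin's actual code; the names FIELDS, parse_cpu_summary and rates_percent are made up for illustration). It assumes that the rate of each state is its difference divided by the sum of all eight differences, which is only one possible normalization:

```python
# Hypothetical short labels for the first eight fields of the "cpu" line.
FIELDS = ("usr", "ni", "sys", "idl", "iow", "hirq", "sirq", "st")


def parse_cpu_summary(text):
    """Return the first eight counters of the summary "cpu" line."""
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return [int(v) for v in parts[1:1 + len(FIELDS)]]
    raise ValueError("no summary cpu line found")


def rates_percent(old, new):
    """Share of each state in the ticks elapsed between two readings."""
    deltas = [n - o for o, n in zip(old, new)]
    total = sum(deltas)
    if total <= 0:
        # No ticks elapsed (or counters went backwards): report zeros.
        return dict.fromkeys(FIELDS, 0.0)
    return {name: 100.0 * d / total for name, d in zip(FIELDS, deltas)}
```

With this particular normalization the eight values always sum to 100%, regardless of the number of CPUs.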

For example, the performance data for a 2-CPU machine could look like this (labels translated from German):

----
Performance data
Label      Value  Warning  Critical
cpu.usr     0.11  -        -
cpu.ni      0.03  -        -
cpu.sys     0.30  -        -
cpu.idl    99.19  -        -
cpu.iow     0.12  90.00    100.00
cpu.hirq    0.00  -        -
cpu.sirq    0.00  -        -
cpu.st      0.04  -        -
----

After monitoring for a while I noticed that the 2-CPU machine's values sum to 100% (i.e. 1.0). When I realized that the values of a 2-socket, 12-core, 24-thread machine sum to roughly 1200% (i.e. 12), I wondered:

Shouldn't the sum be either 1 or the number of logical CPUs found in /proc/cpuinfo? While thinking about it, I wondered why the sum for the 2-CPU machine (which is a virtual one, btw) isn't 200% (2).
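
The two candidate sums correspond to two different normalizations. The following is a minimal Python sketch, not the plugin's code, and the function name cpu_equivalents is hypothetical. It assumes that the counters are in USER_HZ ticks, as proc(5) describes, and that the "cpu" line is the sum over all logical CPUs. Under those assumptions, dividing each difference by the elapsed wall-clock ticks should give values that sum to roughly the number of logical CPUs, whereas dividing by the sum of the differences gives values that sum to 1:

```python
import os


def cpu_equivalents(old, new, elapsed_s, ticks_per_s=None):
    """Differences expressed in "CPUs busy in that state".

    old, new    -- counter lists from two readings of the "cpu" line
    elapsed_s   -- wall-clock seconds between the two readings
    ticks_per_s -- USER_HZ; queried from the system when not given
    """
    if ticks_per_s is None:
        ticks_per_s = os.sysconf("SC_CLK_TCK")
    wall_ticks = elapsed_s * ticks_per_s
    return [(n - o) / wall_ticks for o, n in zip(old, new)]
```

For example, with USER_HZ = 100 and 10 seconds between readings, a fully accounted 2-CPU machine would show differences that add up to about 2000 ticks, so the returned values would add up to about 2.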

So my first suspicion was that the "cpu" summary counters don't match the sum of the individual CPUs' counters, so I wrote a test program.
Indeed I found inconsistencies of about 1 or 2, so I wondered again: Is that due to a dirty read (although the file would be read as one block, I guess)?
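
The check itself can be sketched in a few lines of Python (an illustration with hypothetical names, not the Perl script used for the output in this message, which is at the end). It parses one reading, sums the per-CPU lines column by column, and reports every field where the summary line differs from that sum:

```python
import re

# Matches the summary line ("cpu ...") and the per-CPU lines ("cpuN ...").
CPU_LINE = re.compile(r"^cpu(\d*)\s+(.*)$")


def summary_mismatches(text):
    """Return {field_index: summary - sum_of_cpus} for differing fields."""
    summary, per_cpu = None, []
    for line in text.splitlines():
        m = CPU_LINE.match(line)
        if not m:
            continue
        values = [int(v) for v in m.group(2).split()]
        if m.group(1) == "":
            summary = values
        else:
            per_cpu.append(values)
    if summary is None or not per_cpu:
        raise ValueError("no CPU lines found")
    sums = [sum(column) for column in zip(*per_cpu)]
    return {j: s - c
            for j, (s, c) in enumerate(zip(summary, sums)) if s != c}
```

An empty result means the summary line equals the column sums for that reading.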

So I extended my program to make sure the read is consistent, meaning that I get two identical subsequent readings. It's getting even more interesting:
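
The retry logic can be sketched as follows in Python (hypothetical names; the Perl script actually used is at the end of this message). Unlike that script, the sketch gives up after max_attempts readings, which is an addition made here so that it always terminates:

```python
def read_proc_stat(path="/proc/stat"):
    """Read the whole file in one go and return it as a string."""
    with open(path) as fh:
        return fh.read()


def read_stable(read_once=read_proc_stat, max_attempts=1000):
    """Read until two subsequent readings are identical.

    Returns (reading, number_of_reads), or (None, max_attempts) if no
    two subsequent readings matched within max_attempts reads.
    """
    previous = read_once()
    for attempt in range(2, max_attempts + 1):
        current = read_once()
        if current == previous:
            return current, attempt
        previous = current
    return None, max_attempts
```

Comparing whole readings is stricter than needed if only the CPU lines matter, because other lines of /proc/stat (for example the interrupt counters) change as well; read_once could return just the CPU lines instead.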

On the 2-CPU machine I typically needed only one attempt (that is, actually two readings), but the values had differences like this:
mismatch for cpu stat #7: 68318 - 68317 is 1
mismatch for cpu stat #4: 205778 - 205777 is 1
mismatch for cpu stat #0: 236849 - 236848 is 1

(here "stat #0" is the first field after the "cpu#" label)
Some more:
mismatch for cpu stat #7: 68318 - 68317 is 1
mismatch for cpu stat #4: 205778 - 205777 is 1
mismatch for cpu stat #3: 136439482 - 136439481 is 1
mismatch for cpu stat #2: 630025 - 630024 is 1

mismatch for cpu stat #7: 68320 - 68319 is 1
mismatch for cpu stat #6: 5728 - 5727 is 1
mismatch for cpu stat #4: 205779 - 205778 is 1
mismatch for cpu stat #3: 136440999 - 136440998 is 1
mismatch for cpu stat #2: 630056 - 630055 is 1
mismatch for cpu stat #0: 236865 - 236864 is 1

So my guess is that the status is inconsistent anyway. Trying to get a consistent reading on a machine with more CPUs is even more interesting:
On the 2-socket, 12-core, 24-thread machine my first attempt needed several thousand reads to get one consistent reading, like this:

reading 7637: cpu #5 stat #0 has delta 1
reading 7637: cpu #3 stat #3 has delta 1
reading 7637: cpu #0 stat #3 has delta 2
reading 7637: cpu #0 stat #0 has delta 1
reading 7638: cpu #0 stat #3 has delta 1
reading 7639: cpu #17 stat #3 has delta 1
reading 7639: cpu #0 stat #3 has delta 1
reading 7640: cpu #0 stat #3 has delta 1
reading 7641: cpu #14 stat #3 has delta 1
reading 7641: cpu #0 stat #3 has delta 1
reading 7642: cpu #8 stat #3 has delta 1
reading 7642: cpu #0 stat #3 has delta 1
mismatch for cpu stat #6: 150286 - 150276 is 10
mismatch for cpu stat #4: 233255 - 233242 is 13
mismatch for cpu stat #3: 4588551905 - 4588551892 is 13
mismatch for cpu stat #2: 2078966 - 2078957 is 9
mismatch for cpu stat #1: 95325 - 95316 is 9
mismatch for cpu stat #0: 4371206 - 4371195 is 11

On a 1-socket, 24-core, 48-thread machine, I aborted the attempt after reading 179828:
reading 179824: cpu #47 stat #0 has delta 1
reading 179824: cpu #33 stat #3 has delta 1
reading 179824: cpu #32 stat #3 has delta 1
reading 179824: cpu #20 stat #3 has delta 1
reading 179824: cpu #15 stat #3 has delta 1
reading 179824: cpu #8 stat #3 has delta 1
reading 179824: cpu #0 stat #3 has delta 5
reading 179825: cpu #11 stat #3 has delta 1
reading 179825: cpu #0 stat #3 has delta 5
reading 179826: cpu #26 stat #3 has delta 1
reading 179826: cpu #14 stat #3 has delta 1
reading 179826: cpu #7 stat #3 has delta 1
reading 179826: cpu #0 stat #3 has delta 5
reading 179827: cpu #48 stat #3 has delta 1
reading 179827: cpu #9 stat #3 has delta 1
reading 179827: cpu #0 stat #3 has delta 5
reading 179828: cpu #47 stat #2 has delta 1
^C

So I kindly ask how to get consistent CPU state information from procfs. I'm attaching a graph that illustrates the problem.
The legend (just using the first eight fields) is:
'#1=.usr' (user cpu), '#2=.ni' (nice cpu), '#3=.sys' (system cpu), '#4=.idl' (idle), '#5=.iow' (I/O wait), '#6=.hirq' (hardware IRQs), '#7=.sirq' (software IRQs), '#8=.st' (stolen)

Is "idle" the only field that is in USER_HZ? If so, that makes it hard to use in a universal utility like my monitoring plugin, which can read _any_ value from procfs.

For reference, here's the consistency check script I was using:
#!/usr/bin/perl
# Check consistency of CPU states in /proc/stat
# written for SLES12 SP5 and PERL 5.18 by Ulrich Windl

use strict;
use warnings;

use Fcntl qw(:flock SEEK_SET);

# @readings holds two readings that are filled alternately
my ($rdiff, $rcount, $reading, @readings) = (0, 0, 0, [], []);
my @all_cpu;
if (open(my $fh, '<', my $file = '/proc/stat')) {
    # Re-read until two subsequent readings are identical
    do {
        my $rref = $readings[$reading];

        seek($fh, 0, SEEK_SET) or die "seek() failed SEEK_SET to 0: $!\n";
        while (<$fh>) {
            my @vals;
            # The code block collects every numeric field into @vals
            if (my ($cpu) = /^cpu(\d*)(?:\s+(\d+)(?{ push(@vals, $^N) }))+$/) {
                # Index 0 is the summary "cpu" line, index N + 1 is "cpuN"
                $cpu = length($cpu) > 0 ? $cpu + 1 : 0;
                $rref->[$cpu] = \@vals;
            }
        }
        if (++$rcount > 1) {
            my ($old, $new) = ($readings[($reading + 1) % 2],
                               $readings[$reading]);
            $rdiff = 0;
            for (my $i = $#$rref; $i >= 0; --$i) {
                my ($ocr, $ncr) = ($old->[$i], $new->[$i]);

                for (my $j = $#$ncr; $j >= 0; --$j) {
                    if ((my $delta = $ncr->[$j] - $ocr->[$j]) != 0) {
                        print "reading $rcount: cpu #$i stat #$j has delta " .
                            "$delta\n";
                        $rdiff += abs($delta);
                    }
                }
            }
        }
        $reading = ($reading + 1) % 2;
    } while ($rcount < 2 || $rdiff > 0);
    @all_cpu = @{$readings[$reading]};  # consistent reading
    close($fh);
    if ((my $n = scalar(@all_cpu)) >= 2) {
        my ($sref, $cref);

        # $sref: summary line, $cref: computed sum of the individual CPUs
        $sref = $all_cpu[0];
        $cref = $all_cpu[$n] = [ map { 0 } (0 .. $#$sref) ];
        for (my $i = 1; $i < $n; ++$i) {
            my $iref = $all_cpu[$i];

            for (my $j = $#$cref; $j >= 0; --$j) {
                $cref->[$j] += $iref->[$j];
            }
        }
        for (my $j = $#$sref; $j >= 0; --$j) {
            if ((my $delta = $sref->[$j] - $cref->[$j]) != 0) {
                print "mismatch for cpu stat #$j: $sref->[$j] - $cref->[$j] " .
                    "is $delta\n";
            }
        }
    } else {
        print "No CPU?\n";
    }
} else {
    die "$file: $!\n";
}
--EOF--

Regards,
Ulrich Windl

Attachment: proc-stat-cpu.PNG
Description: PNG image