Re: Unknown HZ value! (2000) Assume 1024.

From: Albert D. Cahalan (acahalan@cs.uml.edu)
Date: Wed May 02 2001 - 01:42:58 EST


> /proc/uptime:
> 4400586.27 150439.36
>
> /proc/stat:
> cpu 371049158 3972370867 8752820 4448994822
> (user, nice, system, idle)
>
> In .../fs/proc/proc_misc.c:kstat_read_proc(), the cpu line is being
> computed by:
>
> len = sprintf(page, "cpu %u %u %u %lu\n", user, nice, system,
> jif * smp_num_cpus - (user + nice + system));

This is pretty bogus. The idle time can run _backwards_ on an SMP
system. What is "top" supposed to do with that, print a negative
number for %idle time? (some versions do, while others truncate
at zero or wrap around to 4 billion -- pick your poison)

> The user, nice, and system values add up to 4352172845 > 2^32, and jif is
> 4400586.27 * 1024 = 4506200340, leading to the incorrect idle time (1
> cpu). It should be calculated this way:
>
> len = sprintf(page, "cpu %u %u %u %lu\n", user, nice, system,
> jif * smp_num_cpus - ((unsigned long)user + nice + system));
>
> or just declare those as unsigned longs instead of ints. I notice also
> that since kstat.per_cpu_nice is an int, it's going to overflow in another
> 3.6 days anyhow. I'll let you know what blows up then. Any chance of
> making those guys longs?

That is good for the Alpha.
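
The arithmetic in the quoted report can be checked in user space. What
follows is only a sketch, not the kernel code: the function names are made
up for illustration, uint64_t stands in for the Alpha's 64-bit unsigned
long, and jif is the reporter's estimate (4400586.27 s * 1024) rather than
a value read from the kernel.

```c
#include <stdint.h>

/* Tick counts quoted from /proc/stat in the report. */
static const uint32_t USER_TICKS   = 371049158u;
static const uint32_t NICE_TICKS   = 3972370867u;
static const uint32_t SYSTEM_TICKS = 8752820u;

/* jif as estimated in the report: 4400586.27 s * 1024 ticks/s. */
static const uint64_t JIF = 4506200340ull;

/* Original expression: the busy total is formed in 32 bits and wraps
 * modulo 2^32 before it is widened for the subtraction. */
static uint64_t idle_buggy(uint64_t jif, uint32_t ncpus)
{
    uint32_t busy = USER_TICKS + NICE_TICKS + SYSTEM_TICKS;
    return jif * ncpus - busy;
}

/* Suggested fix: widen the first operand so the whole sum is 64-bit. */
static uint64_t idle_fixed(uint64_t jif, uint32_t ncpus)
{
    uint64_t busy = (uint64_t)USER_TICKS + NICE_TICKS + SYSTEM_TICKS;
    return jif * ncpus - busy;
}
```

With one CPU, idle_buggy(JIF, 1) gives 4448994791, within a few dozen ticks
of the 4448994822 shown in /proc/stat, while idle_fixed(JIF, 1) gives
154027495 ticks, or about 150417 seconds at 1024 ticks/s, which is close to
the 150439.36 idle seconds in /proc/uptime.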

For 32-bit systems, we use 32-bit values to reduce overhead.
This causes problems at 495/smp_num_cpus days of uptime.
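
As a rough check on that figure, here is a small sketch of the wrap time
of a 32-bit tick total. The helper name is hypothetical, and HZ=100 (the
usual value on 32-bit x86) is an assumption; the post does not state
which HZ it has in mind.

```c
#include <stdint.h>

/* Days of uptime until a 32-bit tick total wraps.  The idle figure is
 * derived from jif * smp_num_cpus, so the limit shrinks with the number
 * of CPUs. */
static double days_until_wrap(unsigned hz, unsigned ncpus)
{
    const double ticks_per_wrap = 4294967296.0;   /* 2^32 */
    const double seconds_per_day = 86400.0;

    return ticks_per_wrap / ((double)hz * ncpus * seconds_per_day);
}
```

Under those assumptions days_until_wrap(100, 1) comes out a little over
497 days, in line with the approximate 495 above. With HZ=1024 and one CPU
it is about 48.5 days, which the 4400586-second (roughly 51-day) uptime in
the report has just passed.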

Proposed hack: set a very-long-duration timer (several days)
to check for the high bit changing. Count bit flips.
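
A user-space sketch of that bit-flip idea, to show how a flip count can
extend a 32-bit counter to 64 bits. None of this is existing kernel code:
the struct and function names are invented, the counter is assumed to
start below 2^31, and the poll is assumed to run at least once every 2^31
ticks so that no flip is missed.

```c
#include <stdint.h>

struct wide_counter {
    uint32_t flips;   /* observed changes of bit 31 of the raw counter */
};

/* Timer callback.  Bit 0 of flips mirrors the last recorded state of the
 * raw counter's high bit, so a mismatch means the bit has flipped. */
static void wide_counter_poll(struct wide_counter *w, uint32_t raw)
{
    if ((raw >> 31) != (w->flips & 1u))
        w->flips++;
}

/* Reader.  Two flips make one full wrap, so flips / 2 supplies the upper
 * 32 bits.  A flip that happened after the last poll is accounted for
 * here without modifying the shared state. */
static uint64_t wide_counter_read(const struct wide_counter *w, uint32_t raw)
{
    uint32_t flips = w->flips;

    if ((raw >> 31) != (flips & 1u))
        flips++;
    return ((uint64_t)(flips >> 1) << 32) | raw;
}
```

The 32-bit counters themselves stay untouched on the hot path; only the
rare timer and the /proc reader pay for the wider value.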


