Re: [2.6.25-rc1] Strange regression with CONFIG_HZ_300=y

From: Eric Piel
Date: Tue Feb 12 2008 - 03:52:58 EST


Carlos R. Mafra wrote:
I apologize in advance if I am crazy about this, but I noticed
a strange regression wrt 2.6.24 in cpufreq (I think) in 2.6.25-rc1, which
goes away if I revert the following commit:

commit bdc807871d58285737d50dc6163d0feb72cb0dc2
Author: H. Peter Anvin <hpa@xxxxxxxxx>
Date: Fri Feb 8 04:21:26 2008 -0800

avoid overflows in kernel/time.c

When the conversion factor between jiffies and milli- or microseconds is
not a single multiply or divide, as for the case of HZ == 300, we currently
do a multiply followed by a divide. The intervening result, however, is
subject to overflows, especially since the fraction is not simplified (for
HZ == 300, we multiply by 300 and divide by 1000).

This is exposed to the user when passing a large timeout to poll(), for
example.

This patch replaces the multiply-divide with a reciprocal multiplication on
32-bit platforms. When the input is an unsigned long, there is no portable
way to do this on 64-bit platforms there is no portable way to do this
since it requires a 128-bit intermediate result (which gcc does support on
64-bit platforms but may generate libgcc calls, e.g. on 64-bit s390), but
since the output is a 32-bit integer in the cases affected, just simplify
the multiply-divide (*3/10 instead of *300/1000).

The reciprocal multiply used can have off-by-one errors in the upper half
of the valid output range. This could be avoided at the expense of having
to deal with a potential 65-bit intermediate result. Since the intent is
to avoid overflow problems and most of the other time conversions are only
semiexact, the off-by-one errors were considered an acceptable tradeoff.

[...]
[more text follows]

The problem in vanilla 2.6.25-rc1 happens with CONFIG_HZ_300=y (and doesn't
with CONFIG_HZ_250=y or with the above commit reverted). The cpu frequency doesn't
change anymore regardless of the load, and it stays high (2.0 GHz or 1.2 GHz) even
when idle (I checked with 'top'), when the usual is to go to 800 Mhz when idle (I
always use the ondemand governor compiled in and as the default governor).

The laptop is a Vaio VGN-FZ240E, core 2 duo T7250 @ 2.0 GHz and the kernel is x86_64.

Hi, it's great you found out the culprit commit because I was really wondering where this bug was coming from...
As a data point, my machine has a core 2 duo @ 1.2GHz and x86_64 arch. Do you also have the tickless option activated? (it could play a role)

See you,
Eric
begin:vcard
fn;quoted-printable:=C3=89ric Piel
n;quoted-printable:Piel;=C3=89ric
org:Technical University of Delft;Software Engineering Research Group
adr:HB 08.080;;Mekelweg 4;Delft;;2628 CD;The Netherlands
email;internet:E.A.B.Piel@xxxxxxxxxx
tel;work:+31 15 278 6338
x-mozilla-html:FALSE
url:http://pieleric.free.fr
version:2.1
end:vcard