Re: [tip:x86/tsc] x86: Improve TSC calibration using a delayed workqueue

From: Konrad Rzeszutek Wilk
Date: Thu Jan 13 2011 - 12:49:59 EST


On Tue, Jan 11, 2011 at 11:56:40AM +0200, Kirill A. Shutemov wrote:
> On Tue, Jan 11, 2011 at 09:37:15AM +0100, Thomas Gleixner wrote:
> > On Tue, 11 Jan 2011, Kirill A. Shutemov wrote:
> >
> > > On Tue, Jan 11, 2011 at 09:26:48AM +0100, Thomas Gleixner wrote:
> > > > On Tue, 11 Jan 2011, Kirill A. Shutemov wrote:
> > > >
> > > > > On Sun, Dec 05, 2010 at 11:18:53AM +0000, tip-bot for John Stultz wrote:
> > > > > > Commit-ID: 08ec0c58fb8a05d3191d5cb6f5d6f81adb419798
> > > > > > Gitweb: http://git.kernel.org/tip/08ec0c58fb8a05d3191d5cb6f5d6f81adb419798
> > > > > > Author: John Stultz <johnstul@xxxxxxxxxx>
> > > > > > AuthorDate: Tue, 27 Jul 2010 17:00:00 -0700
> > > > > > Committer: John Stultz <john.stultz@xxxxxxxxxx>
> > > > > > CommitDate: Thu, 2 Dec 2010 16:48:37 -0800
> > > > > >
> > > > > > x86: Improve TSC calibration using a delayed workqueue
> > > > >
> > > > > This commit breaks booting the kernel in qemu with enabled KVM on my machine.
> > > > > .config attached.
> > > > >
> > > > > [ 0.424013] divide error: 0000 [#1]
> > > >
> > > > Got fixed by a8760ec (x86: Check tsc available/disabled in the delayed
> > > > init function)
> > >
> > > No, it didn't. :(
> > >
> > > I am able to reproduce it on current Linus' tree (v2.6.37-4700-g8adbf8d).
> >
> > Does the patch below fix it ? We can end up with tsc_khz=0 there :(
>
> Yes, it does.

Interestingly enough, when you run Linux under Xen (as Domain 0) you
get the same stack-trace. With both patches (a8760ec, and the patch
posted earlier) I still get the failure.

I've traced it down to the fact that when we boot under Xen we have
neither the HPET enabled nor the ACPI PM timer set up. hpet_enable()
is never called (because xen_time_init is called), and the
calibration of tsc_khz (calibrate_tsc == xen_tsc_khz) gives us a
valid value.

So 'tsc_read_refs' tries to read the ACPI PM timer (acpi_pm_read_early),
however that is disabled under Xen:

[ 1.099272] calling init_acpi_pm_clocksource+0x0/0xdc @ 1
[ 1.140186] PM-Timer failed consistency check (0x0xffffff) - aborting.

So tsc_calibrate_check gets called, it can't use the HPET, and reading
from the ACPI PM timer returns 0xffffff both times. The resulting
(0xffff.. - 0xffff..) delta is zero, and the division that uses it
ends in the divide error.
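
The arithmetic can be illustrated with a minimal userspace sketch. This is
not the kernel code: the helper name tsc_khz_from_pm and the guard are
assumptions made for illustration, and only the PM timer frequency
(3.579545 MHz) and the 24-bit counter width are taken as given.

```c
#include <stdint.h>

/* ACPI PM timer frequency in Hz (3.579545 MHz). */
#define PMTMR_TICKS_PER_SEC 3579545ULL
/* The PM timer is treated as a 24-bit counter in this sketch. */
#define PM_MASK 0xffffffULL

/*
 * Hypothetical helper: derive a TSC frequency in kHz from a TSC delta and
 * two PM timer samples. Returns 0 when the samples describe no usable
 * interval, instead of dividing by zero.
 */
uint64_t tsc_khz_from_pm(uint64_t tsc_delta, uint64_t pm1, uint64_t pm2)
{
	uint64_t pm_delta, ns;

	if (pm2 < pm1)
		pm2 += PM_MASK + 1;	/* counter wrapped once */
	pm_delta = pm2 - pm1;

	/* Elapsed time in nanoseconds according to the PM timer. */
	ns = pm_delta * 1000000000ULL / PMTMR_TICKS_PER_SEC;
	if (ns == 0)
		return 0;	/* e.g. both reads returned 0xffffff */

	/* cycles per ns, scaled by 1e6, gives cycles per ms, i.e. kHz */
	return tsc_delta * 1000000ULL / ns;
}
```

With two identical samples such as 0xffffff and 0xffffff the elapsed time
is zero, so an unguarded version of the last division would trap.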

There is a check in 'tsc_refine_calibration_work' for invalid
values:

	/* hpet or pmtimer available ? */
	if (!hpet && !ref_start && !ref_stop)
		goto out;

But since ref_start and ref_stop both hold 0xffffff, it does not trigger.
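
A small userspace model of that check shows the blind spot. The function
names are hypothetical, and the stricter variant is only an assumption
about what a more defensive test could look like, not something proposed
in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* The check as quoted above: bails out only when both references are 0. */
bool refs_unusable_quoted(bool hpet, uint64_t ref_start, uint64_t ref_stop)
{
	return !hpet && !ref_start && !ref_stop;
}

/*
 * Hypothetical stricter variant: without an HPET, also reject PM timer
 * samples that are identical, since they describe a zero-length interval.
 */
bool refs_unusable_strict(bool hpet, uint64_t ref_start, uint64_t ref_stop)
{
	if (hpet)
		return false;
	return ref_start == ref_stop;
}
```

For ref_start == ref_stop == 0xffffff the quoted check returns false, so
the calibration continues, while the stricter variant returns true.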

This little fix does it, however. Though it will of course not
recalibrate the TSC; is that a horrible thing? Should we look
at making tsc_read_refs also use the pv-ops in case both the HPET and
the ACPI PM timer are disabled?
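
Why clearing pmtmr_ioport is enough can be sketched in userspace, under
the assumption that the early PM timer read returns 0 whenever
pmtmr_ioport is 0. All names ending in _model are stand-ins invented for
this illustration, and the port value used in the usage note is arbitrary.

```c
#include <stdint.h>

/* Stand-in for the kernel's pmtmr_ioport; 0 means "no usable PM timer". */
uint32_t pmtmr_ioport_model;

/* Stand-in for the port read; the log above shows 0xffffff under Xen. */
uint32_t read_pm_port_model(void)
{
	return 0xffffff;
}

/* Assumed behaviour of the early read: return 0 once the port is cleared. */
uint32_t pm_read_early_model(void)
{
	if (!pmtmr_ioport_model)
		return 0;
	return read_pm_port_model() & 0xffffff;
}

/* Applies the quoted check; nonzero means the refinement is skipped. */
int refinement_skipped(int hpet)
{
	uint32_t ref_start = pm_read_early_model();
	uint32_t ref_stop = pm_read_early_model();

	return !hpet && !ref_start && !ref_stop;
}
```

With pmtmr_ioport_model set to a nonzero port such as 0x1008,
refinement_skipped(0) returns 0 and the bogus samples are used. After
setting it to 0, as the patch does on a failed consistency check, it
returns 1 and the refinement is skipped.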


Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>

diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index cfb0f52..84ff897 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -207,6 +208,7 @@ static int __init init_acpi_pm_clocksource(void)
 		if (i == ACPI_PM_READ_CHECKS) {
 			printk(KERN_INFO "PM-Timer failed consistency check "
 			       " (0x%#llx) - aborting.\n", value1);
+			pmtmr_ioport = 0;
 			return -ENODEV;
 		}
 	}