Re: [RFC patch 0/4] TSC calibration improvements

From: Thomas Gleixner
Date: Sat Sep 06 2008 - 16:03:41 EST


On Fri, 5 Sep 2008, Linus Torvalds wrote:
> On Fri, 5 Sep 2008, Alok Kataria wrote:
> >
> > This can happen if, in the pit_expect_msb (the one just before the
> > second read_tsc), we hit an SMI/virtualization event *after* doing the
> > 50 iterations of PIT read loop, this allows the pit_expect_msb to
> > succeed when the SMI returns.
>
> So theoretically, on real hardware, the minimum of 50 reads will take
> 100us. The 256 PIT cycles will take 214us, so in the absolute worst case,
> you can have two consecutive successful cycles despite having a 228us SMI
> (or other event) if it happens just in the middle.

I just ran your patch in a loop on the SMI mephitic laptop with random
delays between the loops.

20 out of 10000 results are between 2200 and 8400MHz while the CPU
still runs at 2GHz.

And it also happened in an automated boot/reboot cycle (no extra loop)
once out of 50.

So the SMI hit exactly:

+	t1 = get_cycles();
+	for (expect = 0xfe, i = 0; i < QUICK_PIT_ITERATIONS; i++, expect--) {
+		if (!pit_expect_msb(expect))
+			goto failed;
+	}

-----> HERE

+	t2 = get_cycles();

The SMIs on that machine take between 500us and 100+ms according to my
instrumentation experiments: http://lkml.org/lkml/2008/9/2/202

Adding another check after the second get_cycles() makes it reliable:

+	t1 = get_cycles();
+	for (expect = 0xfe, i = 0; i < QUICK_PIT_ITERATIONS; i++) {
+		if (!pit_expect_msb(expect--))
+			goto failed;
+	}
+	t2 = get_cycles();
+
+	if (!pit_expect_msb(expect))
+		goto failed;

That solves the problem on that box and will detect any virtualization
delay as well. I just had to move the "expect--" out of the for()
header because gcc is overly clever.

> Of course, then the actual _error_ on the TSC read will be just half that,
> but since there are two TSC reads - one at the beginning and one at the
> end - and if the errors of the two reads go in opposite directions, they
> can add up to 228us.
>
> So I agree - in theory you can have a fairly big error if you hit
> everything just right. In practice, of course, even that *maximal* error
> is actually perfectly fine for TSC calibration.

The above _maximal_ error of 400% is not perfectly fine at all.

> So I just don't care one whit. The fact is, fast bootup is more important
> than some crazy and totally unrealistic VM situation. The 50ms thing was
> already too long, the 250ms one is unbearable.

True. You applied a first draft of my patch straight from the mail. It
was the accumulated findings of my detective work on various wrecked
hardware.

The follow-up patches I sent (http://lkml.org/lkml/2008/9/4/254)
bring it down to 10ms in the good and 30ms in the worst case, with the
option for 50ms in the last round to make the virtualized/emulated
stuff happy. With that I have not seen any wrong result on any of my
jinxed boxen so far and it works very reliably on virtualized
environments as well.

Combining them with your fast calibration should be a solid and
reasonably fast solution for all corner cases.

Ingo put them into the -tip tree:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git x86/tsc


Thanks,

tglx