Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far

From: Len Brown
Date: Fri Mar 16 2007 - 21:41:13 EST


On Friday 16 March 2007 19:44, Thomas Gleixner wrote:
> Maxim,
>
> On Fri, 2007-03-16 at 12:30 +0200, Maxim Levitsky wrote:
> > 3) Sometimes I get this (once in three boots or so)
> >
> > [ 36.217405] ENABLING IO-APIC IRQs
> > [ 36.217587] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> > [ 36.433917] APIC timer disabled due to verification failure.
> >
> > And NO_HZ is disabled due to that (I get 1000/s timer's interrupts)
> > I haven't investigated that yet.
> > It looks like another new test that my hardware fails to perform...
>
> Yes, this is probably caused by SMM code trying to emulate a PS/2
> keyboard from a (maybe connected or not) USB keyboard. Unfortunately we
> have no way to disable this BIOS misfeature in the early boot process.
> Arjan, Len ?????

Nope. By definition, SMM is invisible to the OS -- we don't even
get a bit that says it occurred (though we'd like one -- it would
be really helpful for diagnosing issues like this one).

So go into BIOS SETUP and see if there is a USB Legacy Emulation
feature that you can disable. Sometimes there is not, but disabling
onboard USB altogether may help at least prove the issue is in that area.

> I built in this test to rule out bogus LAPIC timer calibration values
> which are sometimes off by factor 2-10.
>
> But I also built in a calibration against the PM-Timer, which turned out
> to be quite reliable and I think the additional verification step is
> only necessary for systems without PM-Timer.
>
> That was a bit over cautious from my side. I send a patch to avoid this
> when PM-Timer is available in a separate mail.

PM-Timer was invented to work around the issue that the TSC became unreliable
in the face of power management on laptops. In particular, to be able
to time the duration of OS idle periods during which the TSC stopped.

While it is not fine-grained, and it is not low-latency, it should
be very reliable. My understanding is that it is implemented as
a simple divider right off the system 14MHz clock -- the signal
which most motherboard clocks are PLL multiplied up from --
including the 100MHz front-side bus which drives the LAPIC timer.
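
To make the divider arithmetic concrete, here is a minimal sketch, not
kernel code. It assumes the usual figures: a 14.31818 MHz reference
divided by 4, giving 3.579545 MHz, and a 24-bit counter (some chipsets
implement 32 bits). The names pmtmr_delta() and pmtmr_ticks_to_us() are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed PM-Timer rate: the 14.31818 MHz reference divided by 4. */
#define PMTMR_HZ   3579545u

/* Assumed counter width: 24 bits (a 32-bit variant also exists). */
#define PMTMR_MASK 0x00ffffffu

/* Sanity check on the divider relationship stated above. */
static_assert(PMTMR_HZ * 4u == 14318180u, "PM-Timer is reference clock / 4");

/* Ticks elapsed between two raw reads, tolerating a single wraparound. */
static inline uint32_t pmtmr_delta(uint32_t start, uint32_t end)
{
    return (end - start) & PMTMR_MASK;
}

/* Convert a tick count to microseconds (rounds down). */
static inline uint64_t pmtmr_ticks_to_us(uint32_t ticks)
{
    return ((uint64_t)ticks * 1000000u) / PMTMR_HZ;
}
```

At this rate a 24-bit counter wraps roughly every 4.7 seconds, so any
measurement window has to be shorter than that for the masked
subtraction to be valid.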

But that said, I don't understand why calibrating the LAPIC timer
using the PM-timer is going to be more reliable -- exactly how
and why did the previous calibration scheme fail?
Maybe I could follow the new logic in apic.c if I saw the "apic=debug"
output for this box.
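
For reference, the kind of cross-check I would expect looks roughly like
the sketch below. This is a guess at the general idea, not the actual
logic in apic.c: count LAPIC timer ticks and PM-Timer ticks over the
same window, derive the LAPIC rate from the PM-Timer rate, and compare
it with the estimate from the older method. The function names and the
percentage tolerance are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed PM-Timer rate in Hz (14.31818 MHz / 4). */
#define PMTMR_HZ 3579545u

/*
 * Hypothetical helper: estimate the LAPIC timer rate in Hz from the
 * number of LAPIC ticks and PM-Timer ticks seen over the same window.
 */
static inline uint64_t lapic_hz_from_pmtmr(uint32_t lapic_ticks,
                                           uint32_t pm_ticks)
{
    assert(pm_ticks > 0);   /* an empty window cannot be scaled */
    return ((uint64_t)lapic_ticks * PMTMR_HZ) / pm_ticks;
}

/*
 * Hypothetical helper: return 1 if estimate hz_a lies within pct
 * percent of the reference estimate hz_b, else 0.
 */
static inline int calibration_agrees(uint64_t hz_a, uint64_t hz_b,
                                     unsigned pct)
{
    uint64_t diff = hz_a > hz_b ? hz_a - hz_b : hz_b - hz_a;

    return diff * 100 <= hz_b * pct;
}
```

An estimate that is off by a factor of 2 would fail such a check with
any sensible tolerance, which is presumably the failure being guarded
against.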

cheers,
-Len

