Re: [FALSE ALARM] Re: HPET (?) related hangs and breakage in 2.6.35,36

From: Thomas Gleixner
Date: Wed Nov 10 2010 - 15:52:50 EST


On Wed, 10 Nov 2010, Andrew Lutomirski wrote:

> On Wed, Nov 10, 2010 at 1:50 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> > On Wed, Nov 10, 2010 at 01:48:00PM -0500, Andrew Lutomirski wrote:
> >> > Clocksource: tsc unstable (delta = -34355296774 ns)
> >> > Switching: to clocksource hpet
> >>
> >> Please disregard -- this is a bug in nouveau (or drm) not hpet.  I'll
> >> send a bug report to the maintainers.
> >
> > Interesting! Joerg was complaining about similar symptoms with .36 today
> > too.
>
> Well, there is a clocksource sort-of-bug that could cause confusion:
> when something totally unrelated to clocksources goes out to lunch,
> the clocksource watchdog decides that the clocksource is unstable and
> complains, steering everyone toward filing the wrong bug.

How should the clocksource watchdog code know that something went to
lunch? The fact that we need to monitor TSC at all is horrible enough,
adding further heuristics to detect extended lunch breaks would be
just a PITA.

Maybe we could print a different warning when we see large negative
deltas, which are the main indicator that the system was stuck for
quite some time while TSC advanced happily.
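
A rough userspace sketch of that idea, not the kernel's actual watchdog
code: classify the delta between the interval measured by the monitored
clocksource and the interval measured by the watchdog, and pick a
different message when the delta is hugely negative (the one quoted
above is roughly -34 seconds). All names here (wd_classify, wd_message,
STALL_THRESHOLD_NS) are hypothetical, and both threshold values are
assumptions chosen for illustration.

```c
#include <stdint.h>

#define NSEC_PER_SEC       1000000000LL
/* Assumed skew tolerance (1/16 s); illustrative only. */
#define SKEW_THRESHOLD_NS  (NSEC_PER_SEC >> 4)
/* Hypothetical cutoff: a negative delta this large hints at a stall. */
#define STALL_THRESHOLD_NS NSEC_PER_SEC

enum wd_verdict { WD_OK, WD_UNSTABLE, WD_PROBABLY_STALLED };

/*
 * cs_nsec: interval as measured by the monitored clocksource (e.g. TSC)
 * wd_nsec: the same interval as measured by the watchdog clocksource
 */
static enum wd_verdict wd_classify(int64_t cs_nsec, int64_t wd_nsec)
{
	int64_t delta = cs_nsec - wd_nsec;

	/* Within tolerance in either direction: nothing to report. */
	if (delta >= -SKEW_THRESHOLD_NS && delta <= SKEW_THRESHOLD_NS)
		return WD_OK;

	/* Large negative delta: more likely a stuck system than a bad TSC. */
	if (delta <= -STALL_THRESHOLD_NS)
		return WD_PROBABLY_STALLED;

	return WD_UNSTABLE;
}

/* Map the verdict to a warning that points at the likely culprit. */
static const char *wd_message(enum wd_verdict v)
{
	switch (v) {
	case WD_PROBABLY_STALLED:
		return "large negative delta, system was probably stuck";
	case WD_UNSTABLE:
		return "clocksource unstable";
	default:
		return "ok";
	}
}
```

With the delta from the report, wd_classify(0, 34355296774LL) lands in
the WD_PROBABLY_STALLED bucket rather than the plain "unstable" one.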

Thanks,

tglx