Re: [PATCH] clocksource, prevent overflow in clocksource_cyc2ns

From: Thomas Gleixner
Date: Thu Apr 19 2012 - 08:51:28 EST


On Thu, 19 Apr 2012, Thomas Gleixner wrote:

> On Wed, 18 Apr 2012, Prarit Bhargava wrote:
> > There's also some additional information that I've been gathering on this issue;
> > I have seen *idle* systems switch to the hpet because the clocksource watchdog
> > hits the overflow comparison. As expected it happens much less frequently on
> > newer kernels (linux.git top of tree) than older stable kernels (2.6.32 based)
> > due to the difference in shift values but it is happening in both cases.
> >
> > The odd thing about this behaviour is that I would expect it to occur with the
> > same frequency on small systems as it does on large systems with linux.git as
> > the watchdog fires once/second. AFAICT I do not see this on small systems but
> > see it only on systems with greater than 24 cpus (both Intel and AMD).
> >
> > Using debug code similar to the dump code I previously provided, I can see that
> > every so often these large systems can hit a case where the tsc wraps and the
> > hpet is still monotonically increasing. When the unstable calculation is
> > performed the result is obviously affected by the overflow. Sometimes this
> > comparison overflow happens within 18 minutes, other times it can take hours or
> > days.
>
> You are describing symptoms, but the root cause is obviously that the
> watchdog does not get invoked in time. The question is why.
>
> Can you please add the patch below and enable scheduler, timer and irq
> events in the tracer. Tracing will stop once the watchdog triggers.
>
> Please provide the traces. We need to understand the root cause of
> this idle wreckage.
>
> Thanks,
>
> tglx

-ENOPATCH :)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index c958338..2214323 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -287,11 +287,15 @@ static void clocksource_watchdog(unsigned long data)
 		cs->cs_last = csnow;
 		cs->wd_last = wdnow;

+		trace_printk("wd %lld %lld cs %lld %lld\n" , wdnow, wd_nsec,
+			     csnow, cs_nsec);
+
 		if (atomic_read(&watchdog_reset_pending))
 			continue;

 		/* Check the deviation from the watchdog clocksource. */
 		if ((abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD)) {
+			tracing_off();
 			clocksource_unstable(cs, cs_nsec - wd_nsec);
 			continue;
 		}
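
For anyone following the thread, here is a minimal userspace sketch of why a
late watchdog run matters. It assumes the conversion is the plain 64-bit
(cycles * mult) >> shift, and that the threshold is one second >> 4. The
helper names are hypothetical, and the mult/shift values mentioned below are
made up for illustration (a 1 GHz counter with shift 22), not taken from any
of the machines discussed. Only the threshold comparison mirrors the check in
the patch.

```c
#include <stdint.h>

/* Assumption: threshold of 1 s >> 4 (62.5 ms), as in kernels of this era. */
#define SKETCH_WATCHDOG_THRESHOLD (1000000000LL >> 4)

/* Cycles to nanoseconds with plain 64-bit math; the product can wrap. */
static int64_t cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (int64_t)((cycles * mult) >> shift);
}

/* Largest cycle delta whose product with mult still fits in 64 bits.
 * mult is assumed to be non-zero. */
static uint64_t max_safe_cycles(uint32_t mult)
{
	return UINT64_MAX / mult;
}

/* Same shape as the deviation check in clocksource_watchdog(). */
static int watchdog_trips(int64_t cs_nsec, int64_t wd_nsec)
{
	int64_t delta = cs_nsec - wd_nsec;

	if (delta < 0)
		delta = -delta;
	return delta > SKETCH_WATCHDOG_THRESHOLD;
}
```

With mult = 1 << 22 and shift = 22 (one nanosecond per cycle),
max_safe_cycles() gives 2^42 - 1 cycles, roughly 73 minutes at 1 GHz. One
more cycle and cyc2ns() returns 0, which the threshold check then reports as
a large deviation even though both clocks are fine. A watchdog that runs on
time never sees a delta anywhere near that size.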