Re: [PATCH 8/9] clocksource: Improve unstable clocksource detection

From: John Stultz
Date: Tue Aug 18 2015 - 16:11:58 EST


On Tue, Aug 18, 2015 at 12:28 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Tue, 18 Aug 2015, John Stultz wrote:
>> On Tue, Aug 18, 2015 at 1:38 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> > On Mon, 17 Aug 2015, John Stultz wrote:
>> >> On Mon, Aug 17, 2015 at 3:04 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> >> > On Mon, 17 Aug 2015, John Stultz wrote:
>> >> >
>> >> >> From: Shaohua Li <shli@xxxxxx>
>> >> >>
>> >> >> >From time to time we saw TSC is marked as unstable in our systems, while
>> >> >
>> >> > Stray '>'
>> >> >
>> >> >> the CPUs declare to have stable TSC. Looking at the clocksource unstable
>> >> >> detection, there are two problems:
>> >> >> - watchdog clock source wrap. HPET is the most common watchdog clock
>> >> >> source. It's 32-bit and runs in 14.3Mhz. That means the hpet counter
>> >> >> can wrap in about 5 minutes.
>> >> >> - threshold isn't scaled against interval. The threshold is 0.0625s in
>> >> >> 0.5s interval. What if the actual interval is bigger than 0.5s?
>> >> >>
>> >> >> The watchdog runs in a timer bh, so hard/soft irq can defer its running.
>> >> >> Heavy network stack softirq can hog a cpu. IPMI driver can disable
>> >> >> interrupt for a very long time.
>> >> >
>> >> > And they hold off the timer softirq for more than a second? Don't you
>> >> > think that's the problem which needs to be fixed?
>> >>
>> >> Though this is an issue I've experienced (and tried unsuccessfully to
>> >> fix in a more complicated way) with the RT kernel, where high priority
>> >> tasks blocked the watchdog long enough that we'd disqualify the TSC.
>> >
>> > Did it disqualify the watchdog due to HPET wraparounds (5 minutes) or
>> > due to the fixed threshold being applied?
>>
>> This was years ago, but in my experience, the watchdog false positives
>> were due to HPET wraparounds.
>
> Blocking stuff for 5 minutes is insane ....

Yea. It was usually due to -RT stress testing, which kept the
machines busy for quite a while. But again, if you have machines being
maxed out with networking load, etc, even for long amounts of time, we
still want to avoid false positives. Because after the watchdog
disqualifies the TSC, the only clocksources left wrap around much
sooner, and we're more likely to then actually lose time during the
next load spike.
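
For reference, a rough sketch of the arithmetic behind the two problems
quoted above (watchdog wrap and the unscaled threshold). This is only an
illustration, not the patch's code: the helper names are hypothetical, the
14.31818 MHz figure is the nominal HPET rate assumed from the "14.3Mhz" in
the description, and the linear scaling is one possible reading of
"threshold scaled against interval".

```python
# Illustrative only: helper names are hypothetical, not kernel symbols.
HPET_FREQ_HZ = 14_318_180   # assumed nominal HPET rate ("14.3Mhz" in the thread)
HPET_BITS = 32              # HPET counter width stated in the thread


def wrap_seconds(freq_hz, bits):
    """Seconds until a free-running counter of the given width wraps."""
    return (1 << bits) / freq_hz


def wrapped_delta(prev, now, bits):
    """Ticks between two reads; only correct if less than one wrap elapsed."""
    mask = (1 << bits) - 1
    return (now - prev) & mask


def scaled_threshold(interval_s, base_threshold_s=0.0625, base_interval_s=0.5):
    """Scale the 0.0625s-per-0.5s threshold linearly with the real interval."""
    return base_threshold_s * interval_s / base_interval_s


if __name__ == "__main__":
    wrap = wrap_seconds(HPET_FREQ_HZ, HPET_BITS)
    print(f"32-bit HPET wraps after about {wrap / 60:.1f} minutes")

    # A watchdog delayed by one full wrap plus 10s sees only 10s of HPET time.
    real_ticks = (1 << HPET_BITS) + 10 * HPET_FREQ_HZ
    seen = wrapped_delta(0, real_ticks & ((1 << HPET_BITS) - 1), HPET_BITS)
    print(f"watchdog sees {seen / HPET_FREQ_HZ:.1f}s")

    print(f"threshold for a 2s interval: {scaled_threshold(2.0)}s")
```

The second case is why a long enough stall looks like TSC skew: the TSC
delta covers the whole delay, while the masked HPET delta has lost a full
wrap.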

Cc'ing Clark and Steven to see if it's something they still run into,
and maybe they can help validate the patch.

thanks
-john