Re: [PATCH clocksource 5/6] clocksource: Suspend the watchdog temporarily when high read latency detected

From: Thomas Gleixner
Date: Wed Jan 11 2023 - 06:40:28 EST


On Wed, Jan 04 2023 at 17:07, Paul E. McKenney wrote:
> This can be reproduced by running memory intensive 'stream' tests,
> or some of the stress-ng subcases such as 'ioport'.
>
> The reason for these issues is that when the system is under heavy load, the
> read latency of the clocksources can be very high. Even lightweight TSC
> reads can show high latencies, and latencies are much worse for external
> clocksources such as HPET or the ACPI PM timer. These latencies can
> result in false-positive clocksource-unstable determinations.
>
> Given that the clocksource watchdog is a continual diagnostic check with
> frequency of twice a second, there is no need to rush it when the system
> is under heavy load. Therefore, when high clocksource read latencies
> are detected, suspend the watchdog timer for 5 minutes.

We should have enough heuristics in place by now to qualify the output of
the clocksource watchdog as a random number generator, right?