Re: [PATCH] clocksource: don't run watchdog forever

From: Thomas Gleixner
Date: Wed Mar 03 2021 - 13:25:14 EST


On Tue, Mar 02 2021 at 20:06, Feng Tang wrote:
> On Tue, Mar 02, 2021 at 10:16:37AM +0100, Peter Zijlstra wrote:
>> On Tue, Mar 02, 2021 at 10:54:24AM +0800, Feng Tang wrote:
>> > clocksource watchdog runs every 500ms, which creates some OS noise.
>> > As the clocksource wreckage (especially for those that have a per-cpu
>> > reading hook) usually happens shortly after a CPU is brought up or
>> > after the system resumes from a sleep state, add a time limit for
>> > the clocksource watchdog so that it only runs for a period of time,
>> > and make sure it runs at least twice for each CPU.
>> >
>> > Regarding performance data, there is no improvement data with the
>> > micro-benchmarks we have like hackbench/netperf/fio/will-it-scale
>> > etc. But it obviously reduces periodic timer interrupts, and may
>> > help in following cases:
>> > * When some CPUs are isolated to only run scientific or high
>> > performance computing tasks on a NOHZ_FULL kernel, where there
>> > are almost no interrupts, this could make it quieter
>> > * On a cluster which runs a lot of systems in parallel with
>> > barriers, there are always enough systems which run the watchdog
>> > and make everyone else wait
>> >
>> > Signed-off-by: Feng Tang <feng.tang@xxxxxxxxx>
>>
>> Urgh.. so this hopes and prays that the TSC wreckage happens in the
>> first 10 minutes after boot.

which is wishful thinking....

> Yes, the 10 minutes part is only based on our past experience and we
> can make it longer. But if there is a real case where the wreckage
> happens long after the CPU is brought up, like days later, then this
> patch won't help much.

It really depends on the BIOS wreckage. On one of my machines it takes up
to a day, depending on the workload.

Anything pre TSC_ADJUST wants the watchdog on. With TSC_ADJUST available
we can probably avoid it.

There is a caveat though. If the machine never goes idle then TSC_ADJUST
is not able to detect a potential wreckage. OTOH, most of the broken
BIOSes tweak TSC only by a few cycles and that is usually detectable
during boot. So we might be clever about it and schedule a check every
hour if a modification of TSC_ADJUST is seen on any CPU during the first
10 minutes.
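
A rough user-space sketch of that policy, only to make the idea concrete.
This is not kernel code. All names here (tsc_adjust_sample, boot_monitor,
observe_boot_window, choose_policy) are hypothetical, and dropping the
periodic check entirely when no modification was seen is one possible
reading of "we can probably avoid it":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BOOT_WINDOW_SEC (10 * 60) /* "the first 10 minutes" */
#define RECHECK_SEC     (60 * 60) /* "a check every hour" */

/* Hypothetical per-CPU record: TSC_ADJUST at bring-up vs. the value seen now. */
struct tsc_adjust_sample {
	int64_t boot_value;
	int64_t current_value;
};

/* Hypothetical state: was any modification seen inside the boot window? */
struct boot_monitor {
	bool modified_seen;
};

enum check_policy {
	POLICY_WATCHDOG_ALWAYS,   /* pre TSC_ADJUST: keep the watchdog on */
	POLICY_HOURLY_CHECK,      /* modification seen early: recheck every RECHECK_SEC */
	POLICY_NO_PERIODIC_CHECK, /* assumption: nothing seen, so no periodic check */
};

/* True if any CPU's TSC_ADJUST differs from the value recorded at bring-up. */
static bool tsc_adjust_modified(const struct tsc_adjust_sample *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (cpus[i].current_value != cpus[i].boot_value)
			return true;
	}
	return false;
}

/* Latch a modification, but only if it is observed inside the boot window. */
static void observe_boot_window(struct boot_monitor *m, uint64_t uptime_sec,
				const struct tsc_adjust_sample *cpus, size_t n)
{
	assert(m != NULL);
	if (uptime_sec <= BOOT_WINDOW_SEC && tsc_adjust_modified(cpus, n))
		m->modified_seen = true;
}

/* Map "TSC_ADJUST available?" and "modified during boot?" to a policy. */
static enum check_policy choose_policy(bool has_tsc_adjust, bool modified_seen)
{
	if (!has_tsc_adjust)
		return POLICY_WATCHDOG_ALWAYS;
	return modified_seen ? POLICY_HOURLY_CHECK : POLICY_NO_PERIODIC_CHECK;
}
```

The sketch does not model the caveat above: a machine that never goes
idle would never get a chance to notice the modification in the first
place.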

Where is this TSC_DISABLE_WRITE bit again?

Thanks,

tglx