Re: [RFC PATCH v4 17/21] x86/tsc: Switch to perf-based hardlockup detector if TSC become unstable

From: Stephane Eranian
Date: Thu Jun 06 2019 - 20:40:13 EST


Hi Ricardo,
Thanks for your contribution here. It is very important to move the
watchdog out of the PMU wherever possible.

On Thu, May 23, 2019 at 6:17 PM Ricardo Neri
<ricardo.neri-calderon@xxxxxxxxxxxxxxx> wrote:
>
> The HPET-based hardlockup detector relies on the TSC to determine if an
> observed NMI interrupt was originated by HPET timer. Hence, this detector
> can no longer be used with an unstable TSC.
>
> In such case, permanently stop the HPET-based hardlockup detector and
> start the perf-based detector.
>
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx>
> ---
> arch/x86/include/asm/hpet.h | 2 ++
> arch/x86/kernel/tsc.c | 2 ++
> arch/x86/kernel/watchdog_hld.c | 7 +++++++
> 3 files changed, 11 insertions(+)
>
> diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
> index fd99f2390714..a82cbe17479d 100644
> --- a/arch/x86/include/asm/hpet.h
> +++ b/arch/x86/include/asm/hpet.h
> @@ -128,6 +128,7 @@ extern int hardlockup_detector_hpet_init(void);
> extern void hardlockup_detector_hpet_stop(void);
> extern void hardlockup_detector_hpet_enable(unsigned int cpu);
> extern void hardlockup_detector_hpet_disable(unsigned int cpu);
> +extern void hardlockup_detector_switch_to_perf(void);
> #else
> static inline struct hpet_hld_data *hpet_hardlockup_detector_assign_timer(void)
> { return NULL; }
> @@ -136,6 +137,7 @@ static inline int hardlockup_detector_hpet_init(void)
> static inline void hardlockup_detector_hpet_stop(void) {}
> static inline void hardlockup_detector_hpet_enable(unsigned int cpu) {}
> static inline void hardlockup_detector_hpet_disable(unsigned int cpu) {}
> +static void harrdlockup_detector_switch_to_perf(void) {}
> #endif /* CONFIG_X86_HARDLOCKUP_DETECTOR_HPET */
>
This does not compile for me when CONFIG_X86_HARDLOCKUP_DETECTOR_HPET
is not enabled.
because:
1- you have a typo on the function name
2- you are missing the inline keyword


> #else /* CONFIG_HPET_TIMER */
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 59b57605e66c..b2210728ce3d 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -1158,6 +1158,8 @@ void mark_tsc_unstable(char *reason)
>
> clocksource_mark_unstable(&clocksource_tsc_early);
> clocksource_mark_unstable(&clocksource_tsc);
> +
> + hardlockup_detector_switch_to_perf();
> }
>
> EXPORT_SYMBOL_GPL(mark_tsc_unstable);
> diff --git a/arch/x86/kernel/watchdog_hld.c b/arch/x86/kernel/watchdog_hld.c
> index c2512d4c79c5..c8547c227a41 100644
> --- a/arch/x86/kernel/watchdog_hld.c
> +++ b/arch/x86/kernel/watchdog_hld.c
> @@ -76,3 +76,10 @@ void watchdog_nmi_stop(void)
> if (detector_type == X86_HARDLOCKUP_DETECTOR_HPET)
> hardlockup_detector_hpet_stop();
> }
> +
> +void hardlockup_detector_switch_to_perf(void)
> +{
> + detector_type = X86_HARDLOCKUP_DETECTOR_PERF;
> + hardlockup_detector_hpet_stop();
> + hardlockup_start_all();
> +}
> --
> 2.17.1
>