Re: [PATCH v5] arm64: Enable perf events based hard lockup detector

From: Will Deacon
Date: Tue Jan 26 2021 - 09:19:58 EST


Hi Sumit,

On Fri, Jan 15, 2021 at 05:31:41PM +0530, Sumit Garg wrote:
> With the recent feature added to enable perf events to use pseudo NMIs
> as interrupts on platforms which support GICv3 or later, it is now
> possible to enable the hard lockup detector (or NMI watchdog) on arm64
> platforms. So enable the corresponding support.
>
> One thing to note here is that the lockup detector is normally
> initialized just after the early initcalls, but the PMU on arm64 comes
> up much later, as a device_initcall(). So we need to re-initialize
> lockup detection once the PMU has been initialized.
>
> Signed-off-by: Sumit Garg <sumit.garg@xxxxxxxxxx>
> ---

[...]

> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 3605f77a..bafb7c8 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -23,6 +23,8 @@
>  #include <linux/platform_device.h>
>  #include <linux/sched_clock.h>
>  #include <linux/smp.h>
> +#include <linux/nmi.h>
> +#include <linux/cpufreq.h>
>
>  /* ARMv8 Cortex-A53 specific event types. */
>  #define ARMV8_A53_PERFCTR_PREF_LINEFILL 0xC2
> @@ -1246,12 +1248,30 @@ static struct platform_driver armv8_pmu_driver = {
>  	.probe = armv8_pmu_device_probe,
>  };
>
> +static int __init lockup_detector_init_fn(void *data)
> +{
> +	lockup_detector_init();
> +	return 0;
> +}
> +
>  static int __init armv8_pmu_driver_init(void)
>  {
> +	int ret;
> +
>  	if (acpi_disabled)
> -		return platform_driver_register(&armv8_pmu_driver);
> +		ret = platform_driver_register(&armv8_pmu_driver);
>  	else
> -		return arm_pmu_acpi_probe(armv8_pmuv3_init);
> +		ret = arm_pmu_acpi_probe(armv8_pmuv3_init);
> +
> +	/*
> +	 * Try to re-initialize lockup detector after PMU init in
> +	 * case PMU events are triggered via NMIs.
> +	 */
> +	if (ret == 0 && arm_pmu_irq_is_nmi())
> +		smp_call_on_cpu(raw_smp_processor_id(), lockup_detector_init_fn,
> +				NULL, false);
> +
> +	return ret;

What's wrong with the alternative approach outlined by Mark:

https://lore.kernel.org/r/20210113130235.GB19011@C02TD0UTHF1T.local

?

Will