Re: [PATCH v3] x86/kdump: Handle blocked NMIs interrupt to avoid kdump crashes

From: Peter Zijlstra
Date: Tue Feb 14 2023 - 10:47:29 EST


On Tue, Feb 14, 2023 at 05:30:46PM +0800, Zeng Heng wrote:

> > I never remember the shutdown paths -- do we force wipe the PMU
> > registers somewhere before this?
>
> I have checked the panic process, and there is no wipe operation for PMU
> registers, which causes the watchdog bites.
>
> Do you mean we should directly disable PMU registers instead of calling
> `iret_to_self` to consume blocked NMI interrupts?

If you don't wipe the PMU, there will be a continuous stream of NMIs; a
single IRET-to-SELF isn't going to save you.

Anyway, I had a bit of a grep around and found that we have:

kernel/events/core.c: register_reboot_notifier(&perf_reboot_notifier);

which should end up killing all the PMU activity. Somewhere around there
there's also a CONFIG_KEXEC_CORE ifdef, so I'm thinking it gets called
on the panic->crash-kernel path too?

If not, someone should look at doing something there.
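
To make the question concrete, here is a minimal userspace sketch of the
concern. It is a mock, not kernel code: every mock_* name is hypothetical,
and the assumption that the crash path skips the reboot notifier chain is
made purely for illustration, since that is exactly the open question
above. It only shows that a wipe hooked into a notifier chain leaves the
counters armed on any path that never walks the chain.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock only; all names here are hypothetical, not kernel APIs. */
#define MOCK_NR_COUNTERS   4
#define MOCK_MAX_NOTIFIERS 8

typedef int (*mock_notifier_fn)(unsigned long event);

static mock_notifier_fn mock_reboot_chain[MOCK_MAX_NOTIFIERS];
static size_t mock_reboot_chain_len;

/* Nonzero means the counter is armed and could still raise an NMI. */
static unsigned long mock_pmu_ctrl[MOCK_NR_COUNTERS];

/* Append a callback to the mock reboot chain. */
static void mock_register_reboot_notifier(mock_notifier_fn fn)
{
    assert(mock_reboot_chain_len < MOCK_MAX_NOTIFIERS);
    mock_reboot_chain[mock_reboot_chain_len++] = fn;
}

/* Stand-in for the perf reboot callback: disarm every counter. */
static int mock_perf_reboot(unsigned long event)
{
    (void)event;
    for (size_t i = 0; i < MOCK_NR_COUNTERS; i++)
        mock_pmu_ctrl[i] = 0;
    return 0;
}

/* Call every registered callback in registration order. */
static void mock_run_reboot_chain(unsigned long event)
{
    for (size_t i = 0; i < mock_reboot_chain_len; i++)
        mock_reboot_chain[i](event);
}

/* Count the counters that are still armed. */
static size_t mock_armed_counters(void)
{
    size_t n = 0;
    for (size_t i = 0; i < MOCK_NR_COUNTERS; i++)
        if (mock_pmu_ctrl[i] != 0)
            n++;
    return n;
}

/* Orderly reboot: walks the notifier chain, so the PMU gets wiped. */
static void mock_orderly_reboot(void)
{
    mock_run_reboot_chain(1);
}

/*
 * Crash path: ASSUMED here to bypass the chain. The PMU is wiped only
 * if the path does so explicitly (wipe_pmu != 0).
 */
static void mock_crash_path(int wipe_pmu)
{
    if (wipe_pmu)
        mock_perf_reboot(0);
}
```

In this mock, mock_crash_path(0) leaves the armed counters untouched,
while mock_crash_path(1) and mock_orderly_reboot() both clear them. If
the real panic->crash-kernel path turns out to behave like
mock_crash_path(0), an explicit wipe on that path is the "something"
that would need doing.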