Re: [PATCH v3] x86/kdump: Handle blocked NMIs interrupt to avoid kdump crashes

From: Baoquan He
Date: Tue Feb 14 2023 - 20:02:29 EST


Add kexec list to CC.

On 02/14/23 at 10:49am, Peter Zijlstra wrote:
> On Tue, Feb 14, 2023 at 05:30:46PM +0800, Zeng Heng wrote:
>
> > > I never remember the shutdown paths -- do we force wipe the PMU
> > > registers somewhere before this?
> >
> > I have checked the panic process, and there is no wipe operation for PMU
> > registers, which causes the watchdog bites.
> >
> > Do you mean we should directly disable PMU registers instead of calling
> > `iret_to_self` to consume blocked NMI interrupts?
>
> If you don't wipe the PMU, there will be many and continuous NMIs, a
> single IRET-to-SELF isn't going to save you.
>
> Anyway, I had a bit of a grep around and I find we have:
>
> kernel/events/core.c: register_reboot_notifier(&perf_reboot_notifier);
>
> which should end up killing all the PMU activity. Somewhere around there
> there's also a CONFIG_KEXEC_CORE ifdef, so I'm thinking it gets called
> on the panic->crash-kernel path too?

No, reboot_notifier_list is only handled in the kexec reboot/reboot path;
please see the kernel_restart_prepare() invocation. The kdump path only
shuts down key components such as the CPUs and the interrupt controller.
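
To make the difference concrete, here is a rough userspace model of the two
paths. It is only a sketch, not kernel code: every model_* name and the
pmu_active flag are made up for illustration, and the real shutdown
sequences do much more than this. The only point it shows is that the
reboot notifier chain runs on the kexec reboot path and is skipped on the
crash path, so a notifier that quiesces the PMU never gets a chance to run
before the capture kernel starts.

```c
#include <assert.h>

#define MAX_NOTIFIERS 8

typedef void (*notifier_fn)(void);

/* Hypothetical stand-in for reboot_notifier_list. */
static notifier_fn reboot_notifiers[MAX_NOTIFIERS];
static int nr_reboot_notifiers;

/* Hypothetical flag: 1 while a perf counter could still raise NMIs. */
static int pmu_active = 1;

/* Stand-in for register_reboot_notifier(). */
void model_register_reboot_notifier(notifier_fn fn)
{
	assert(nr_reboot_notifiers < MAX_NOTIFIERS);
	reboot_notifiers[nr_reboot_notifiers++] = fn;
}

/* Stand-in for the perf reboot notifier: stops all PMU activity. */
void model_perf_reboot(void)
{
	pmu_active = 0;
}

/* Stand-in for kernel_restart_prepare(): walks the notifier chain. */
void model_restart_prepare(void)
{
	for (int i = 0; i < nr_reboot_notifiers; i++)
		reboot_notifiers[i]();
}

/* kexec reboot path: notifiers run first. Returns the PMU state
 * that the next kernel would inherit. */
int model_kexec_reboot(void)
{
	model_restart_prepare();
	return pmu_active;
}

/* kdump path: only CPUs and interrupt controller are shut down in the
 * real kernel; the notifier chain is never walked, so the PMU state is
 * handed over unchanged. */
int model_crash_kexec(void)
{
	return pmu_active;
}
```

With model_perf_reboot registered, model_crash_kexec() still reports the
PMU as active, while model_kexec_reboot() reports it as stopped.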

>
> If not, someone should look at doing something there.
>