Re: [PATCH v3] x86/kdump: Handle blocked NMIs interrupt to avoid kdump crashes

From: Zeng Heng
Date: Tue Feb 14 2023 - 22:05:22 EST



On 2023/2/15 9:01, Baoquan He wrote:
Add kexec list to CC.

On 02/14/23 at 10:49am, Peter Zijlstra wrote:
On Tue, Feb 14, 2023 at 05:30:46PM +0800, Zeng Heng wrote:

I never remember the shutdown paths -- do we force wipe the PMU
registers somewhere before this?
I have checked the panic path, and nothing there wipes the PMU
registers, which is what causes the watchdog to bite.

Do you mean we should directly disable the PMU registers instead of
calling `iret_to_self()` to consume the blocked NMIs?
If you don't wipe the PMU, there will be a continuous stream of NMIs; a
single IRET-to-SELF isn't going to save you.

Anyway, I had a bit of a grep around and I find we have:

kernel/events/core.c: register_reboot_notifier(&perf_reboot_notifier);

which should end up killing all the PMU activity. Somewhere around there
there's also a CONFIG_KEXEC_CORE ifdef, so I'm thinking it gets called
on the panic->crash-kernel path too?
No, reboot_notifier_list is only handled in the kexec reboot and normal
reboot paths; please see the kernel_restart_prepare() invocation. The kdump
path only shuts down key components such as the CPUs and the interrupt
controller.

I will replace iret_to_self() with perf_event_exit_cpu() in the kdump
shutdown path (in native_machine_crash_shutdown()).
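
To illustrate the difference between the two paths discussed above, here is
a minimal user-space sketch. It is not kernel code: every toy_* name is
hypothetical, and the model only assumes what the thread states, namely that
the restart path walks the reboot notifier list while the crash path does
not, so the crash path has to quiesce the PMU explicitly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for per-CPU PMU state; not a real kernel structure. */
struct toy_pmu {
	bool counters_enabled;	/* true: PMU may still raise NMIs */
};

typedef void (*toy_notifier_fn)(struct toy_pmu *pmu);

#define TOY_MAX_NOTIFIERS 4

/* Hypothetical model of reboot_notifier_list. */
static toy_notifier_fn toy_reboot_notifiers[TOY_MAX_NOTIFIERS];
static size_t toy_nr_notifiers;

/* Returns 0 on success, -1 if the toy list is full. */
static int toy_register_reboot_notifier(toy_notifier_fn fn)
{
	if (toy_nr_notifiers == TOY_MAX_NOTIFIERS)
		return -1;
	toy_reboot_notifiers[toy_nr_notifiers++] = fn;
	return 0;
}

/* Models the role of perf_reboot_notifier: stop all PMU activity. */
static void toy_perf_reboot(struct toy_pmu *pmu)
{
	pmu->counters_enabled = false;
}

/* Restart path: every registered reboot notifier is called. */
static void toy_restart_path(struct toy_pmu *pmu)
{
	for (size_t i = 0; i < toy_nr_notifiers; i++)
		toy_reboot_notifiers[i](pmu);
}

/*
 * Crash path: the notifier list is never walked. The PMU is only
 * quiesced if the crash shutdown code does so itself, which is what
 * quiesce_pmu models here.
 */
static void toy_crash_path(struct toy_pmu *pmu, bool quiesce_pmu)
{
	if (quiesce_pmu)
		pmu->counters_enabled = false;
	/* CPU and interrupt controller shutdown would follow here. */
}
```

In this model, a PMU that goes through toy_crash_path() with quiesce_pmu
set to false is left with its counters enabled, which corresponds to the
case where NMIs keep arriving in the crash kernel.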


After testing, I will send v4.

Thanks all,

Zeng Heng