Re: Should arm64 have a custom crash shutdown handler?

From: Vitaly Kuznetsov
Date: Thu May 05 2022 - 09:53:24 EST


"Guilherme G. Piccoli" <gpiccoli@xxxxxxxxxx> writes:

> On 05/05/2022 09:53, Mark Rutland wrote:
>> [...]
>> Looking at those, the cleanup work is all arch-specific. What exactly would we
>> need to do on arm64, and why does it need to happen at that point specifically?
>> On arm64 we don't expect as much paravirtualization as on x86, so it's not
>> clear to me whether we need anything at all.
>>
>>> Anyway, the idea here was to gather a feedback on how "receptive" arm64
>>> community would be to allow such customization, appreciated your feedback =)
>>
>> ... and are you trying to do this for Hyper-V or just using that as an example?
>>
>> I think we're not going to be very receptive without a more concrete example of
>> what you want.
>>
>> What exactly do *you* need, and *why*? Is that for Hyper-V or another hypervisor?
>>
>> Thanks
>> Mark.
>
> Hi Mark, my plan would be doing that for Hyper-V - kind of the same
> code, almost. For example, in hv_crash_handler() there is a stimer
> clean-up and the vmbus unload - my understanding is that this same code
> would need to run in arm64. Michael Kelley is CCed, he was discussing
> with me in the panic notifiers thread and may elaborate more on the needs.
>
> But also (not related with my specific plan), I've seen KVM quiesce code
> on x86 as well [see kvm_crash_shutdown() on arch/x86] , I'm not sure if
> this is necessary for arm64 or if this already executing in some
> abstracted form, I didn't dig deep - probably Vitaly is aware of that,
> hence I've CCed him here.

Regarding the difference between the reboot notifier call chain and
machine_ops.crash_shutdown for KVM/x86: the reboot notifier is called on
some CPU while the VM is still fully functional, so we can e.g. still use
IPIs (see kvm_pv_reboot_notify() doing on_each_cpu()). In a crash
situation, machine_ops.crash_shutdown is called on the CPU which crashed.
We can't count on IPIs still being functional, so we do the bare minimum
needed for *this* CPU to boot the kdump kernel. There's no guarantee that
other CPUs can still boot, but normally we do kdump with 'nr_cpus=1'.
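
To illustrate the distinction, here is a rough userspace model, not kernel
code. All names in it (pv_enabled, model_reboot_notify(),
model_crash_shutdown(), the NR_CPUS value of 4) are made up for the sketch;
the loop in model_reboot_notify() merely stands in for what on_each_cpu()
achieves via IPIs.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* arbitrary CPU count for the model */

/* Hypothetical per-CPU flag: true while a paravirt feature is enabled. */
bool pv_enabled[NR_CPUS];

/* Reset the model so that every CPU has the feature enabled. */
void pv_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		pv_enabled[cpu] = true;
}

/* Disable the feature on a single CPU. */
void pv_disable_on(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	pv_enabled[cpu] = false;
}

/*
 * Reboot path: the system is healthy, so the CPU running the notifier
 * can reach all CPUs. The loop is a stand-in for on_each_cpu().
 */
void model_reboot_notify(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		pv_disable_on(cpu);
}

/*
 * Crash path: IPIs cannot be trusted, so only the crashing CPU
 * cleans up its own state before jumping to the kdump kernel.
 */
void model_crash_shutdown(int crashing_cpu)
{
	pv_disable_on(crashing_cpu);
}

/* Count the CPUs that still have the feature enabled. */
int count_enabled(void)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (pv_enabled[cpu])
			n++;
	return n;
}
```

In this model, after model_crash_shutdown() the other CPUs are left with
stale state, which is why only the crashing CPU is known to be usable in
the kdump kernel.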

For Hyper-V, the situation is similar: hv_crash_handler() initiates
VMbus unload on the crashing CPU only. There's no mechanism to do a
'global' unload, so other CPUs will likely not be able to connect VMbus
devices in the kdump kernel, but this should not be necessary.

There's also the crash_kexec_post_notifiers mechanism which could be used
instead, but it's disabled by default, so using machine_ops.crash_shutdown
is better.

--
Vitaly