[PATCH v4 0/2] xen: fix HVM kexec kernel panic

From: Dongli Zhang
Date: Wed Mar 02 2022 - 11:42:38 EST


This is v4 of the patch set to fix the Xen HVM kernel panic that occurs when
kexec is triggered on a VCPU >= 32. The kexec'ed kernel panics early in boot:

PANIC: early exception 0x0e IP 10:ffffffffa96679b6 error 0 cr2 0x20
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.17.0-rc4xen-00054-gf71077a4d84b-dirty #1
... ...
[ 0.000000] RIP: 0010:pvclock_clocksource_read+0x6/0xb0
... ...
[ 0.000000] RSP: 0000:ffffffffaae03e10 EFLAGS: 00010082 ORIG_RAX: 0000000000000000
[ 0.000000] RAX: 0000000000000000 RBX: 0000000000010000 RCX: 0000000000000002
[ 0.000000] RDX: 0000000000000003 RSI: ffffffffaac37515 RDI: 0000000000000020
[ 0.000000] RBP: 0000000000011000 R08: 0000000000000000 R09: 0000000000000001
[ 0.000000] R10: ffffffffaae03df8 R11: ffffffffaae03c68 R12: 0000000040000004
[ 0.000000] R13: ffffffffaae03e50 R14: 0000000000000000 R15: 0000000000000000
[ 0.000000] FS: 0000000000000000(0000) GS:ffffffffab588000(0000) knlGS:0000000000000000
[ 0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.000000] CR2: 0000000000000020 CR3: 00000000ea410000 CR4: 00000000000406a0
[ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.000000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.000000] Call Trace:
[ 0.000000] <TASK>
[ 0.000000] ? xen_clocksource_read+0x24/0x40
[ 0.000000] ? xen_init_time_common+0x5/0x49
[ 0.000000] ? xen_hvm_init_time_ops+0x23/0x45
[ 0.000000] ? xen_hvm_guest_init+0x221/0x25c
[ 0.000000] ? 0xffffffffa9600000
[ 0.000000] ? setup_arch+0x440/0xbd6
[ 0.000000] ? start_kernel+0x6c/0x695
[ 0.000000] ? secondary_startup_64_no_verify+0xd5/0xdb
[ 0.000000] </TASK>
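
One possible reading of the trace above, which this cover letter does not
spell out: the faulting address (CR2 and RDI are both 0x20) matches a NULL
per-VCPU vcpu_info pointer plus the offset of its 'time' member. The sketch
below uses local stand-in structs. The "_sketch" names and the helper
vcpu_time_offset() are hypothetical, and the field layout is an assumption
based on the x86-64 Xen ABI headers, not something taken from this series.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct pvclock_vcpu_time_info (32 bytes, packed). */
struct pvclock_time_sketch {
    uint32_t version;
    uint32_t pad0;
    uint64_t tsc_timestamp;
    uint64_t system_time;
    uint32_t tsc_to_system_mul;
    int8_t   tsc_shift;
    uint8_t  flags;
    uint8_t  pad[2];
} __attribute__((packed));

/* Stand-in for struct arch_vcpu_info on x86-64 (assumed layout). */
struct arch_vcpu_info_sketch {
    uint64_t cr2;
    uint64_t pad;
};

/* Stand-in for struct vcpu_info on x86-64 (assumed layout). */
struct vcpu_info_sketch {
    uint8_t  evtchn_upcall_pending;
    uint8_t  evtchn_upcall_mask;
    uint64_t evtchn_pending_sel;        /* xen_ulong_t on x86-64 */
    struct arch_vcpu_info_sketch arch;
    struct pvclock_time_sketch time;    /* what the clocksource read uses */
};

/* Byte offset of 'time' inside the stand-in vcpu_info. */
size_t vcpu_time_offset(void)
{
    return offsetof(struct vcpu_info_sketch, time);
}
```

Under these assumptions vcpu_time_offset() evaluates to 0x20 on x86-64, so
taking the address of 'time' through a NULL vcpu_info pointer and reading from
it would fault at address 0x20, consistent with the CR2 value in the panic.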


Changed since v1:
- Extend the commit message to explain why xen_hvm_init_time_ops() is
delayed for any VCPU. (Suggested by Boris Ostrovsky)
- Add a comment in xen_hvm_smp_prepare_boot_cpu() referencing the related
code in xen_hvm_guest_init(). (Suggested by Juergen Gross)
Changed since v2:
- Delay for all VCPUs. (Suggested by Boris Ostrovsky)
- Explain in the commit message why PVM is not supported by this patch
- Test if kexec/kdump works with mainline xen (HVM and PVM)
Changed since v3:
- Re-use v2 but move the logic into xen_hvm_init_time_ops() (Suggested
by Boris Ostrovsky)


I have tested with an HVM guest on both an old Xen and mainline Xen.

On mainline Xen, 'soft_reset' works after I reset d->creation_reset, as
suggested by Jan Beulich:

https://lore.kernel.org/all/d3814109-f4ba-9edb-1575-ab94faaeba08@xxxxxxxx/


Thank you very much!

Dongli Zhang