RE: [PATCH] x86/hyperv: Suspend/resume the VP assist page for hibernation

From: Dexuan Cui
Date: Fri Apr 17 2020 - 19:47:54 EST


> From: Wei Liu <wei.liu@xxxxxxxxxx>
> Sent: Friday, April 17, 2020 4:00 AM
> To: Dexuan Cui <decui@xxxxxxxxxxxxx>
>
> On Thu, Apr 16, 2020 at 11:29:59PM -0700, Dexuan Cui wrote:
> > Unlike the other CPUs, CPU0 is never offlined during hibernation. So in the
> > resume path, the "new" kernel's VP assist page is not suspended (i.e.
> > disabled), and later when we jump to the "old" kernel, the page is not
> > properly re-enabled for CPU0 with the allocated page from the old kernel.
> >
> > So far, the VP assist page is only used by hv_apic_eoi_write(). When the
> > page is not properly re-enabled, hvp->apic_assist is always 0, so the
> > HV_X64_MSR_EOI MSR is always written. This is not ideal with respect to
> > performance, but Hyper-V can still correctly handle this.
> >
> > The issue is: the hypervisor can corrupt the old kernel memory, and hence
> > sometimes cause unexpected behaviors, e.g. when the old kernel's non-boot
> > CPUs are being onlined in the resume path, the VM can hang or be killed
> > due to virtual triple fault.
>
> I don't quite follow here.
>
> The first sentence is rather alarming -- why would Hyper-V corrupt
> guest's memory (kernel or not)?

Without this patch, after the VM resumes from hibernation, the hypervisor
still thinks the assist page of vCPU0 is the physical page allocated by the
"new" kernel. (The "new" kernel started up freshly, loaded the saved state of
the "old" kernel from disk into memory, and jumped to the "old" kernel.) But
the same physical page may have been allocated to store something different in
the "old" kernel, which is the currently running kernel now that the VM has
resumed.

Conceptually, it looks like Hyper-V writes into the assist page from time to time,
e.g. for the EOI optimization. This "corrupts" the page for the "old" kernel.
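
To make the stale-page scenario concrete, here is a toy user-space model. It
is only a sketch, not kernel code: every toy_* name, the 4-page "guest memory"
and the single 32-bit write standing in for the hypervisor's update are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096
#define TOY_NR_PAGES  4

/* Toy "guest physical memory", indexed by page frame number (PFN). */
static uint8_t guest_mem[TOY_NR_PAGES][TOY_PAGE_SIZE];

/* Toy hypervisor state: the PFN it believes is vCPU0's assist page. */
struct toy_hv {
    int enabled;
    unsigned int pfn;
};

static void toy_hv_enable_assist(struct toy_hv *hv, unsigned int pfn)
{
    hv->enabled = 1;
    hv->pfn = pfn;
}

static void toy_hv_disable_assist(struct toy_hv *hv)
{
    hv->enabled = 0;
}

/* The hypervisor updates the first word of whatever page it has recorded. */
static void toy_hv_touch_assist(const struct toy_hv *hv, uint32_t val)
{
    if (hv->enabled)
        memcpy(guest_mem[hv->pfn], &val, sizeof(val));
}

/*
 * Model one resume. Returns 1 if the old kernel's data was clobbered.
 * suspend_before_jump != 0 models disabling the page before the jump and
 * re-enabling it afterwards with the old kernel's own page.
 */
static int toy_resume(int suspend_before_jump)
{
    struct toy_hv hv = {0};
    const unsigned int new_kernel_pfn = 1; /* page the "new" kernel picked */
    const unsigned int old_kernel_pfn = 2; /* page the "old" kernel picked */
    uint8_t expected[TOY_PAGE_SIZE];

    /* The "new" kernel boots and enables its assist page. */
    toy_hv_enable_assist(&hv, new_kernel_pfn);
    if (suspend_before_jump)
        toy_hv_disable_assist(&hv);

    /* The image is restored: PFN 1 now holds unrelated old-kernel data. */
    memset(guest_mem, 0, sizeof(guest_mem));
    memset(guest_mem[new_kernel_pfn], 0xAB, TOY_PAGE_SIZE);
    memcpy(expected, guest_mem[new_kernel_pfn], TOY_PAGE_SIZE);

    /* After the jump, the "old" kernel re-enables its own page. */
    if (suspend_before_jump)
        toy_hv_enable_assist(&hv, old_kernel_pfn);

    /* Later, the hypervisor writes to the page it has on record. */
    toy_hv_touch_assist(&hv, 1);

    return memcmp(expected, guest_mem[new_kernel_pfn], TOY_PAGE_SIZE) != 0;
}

/* Usage: the unrelated page is clobbered only without the disable/enable. */
void toy_stale_page_demo(void)
{
    assert(toy_resume(0) == 1);
    assert(toy_resume(1) == 0);
}
```

In this model the only difference between the two runs is whether the
hypervisor's record is torn down before the jump and set up again afterwards.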

I'm not absolutely sure whether this explains the strange hang or triple fault
I occasionally saw in my long-haul hibernation test, but with this patch
I can no longer reproduce either of them, so I think this patch works.

> Secondly, code below only specifies cpu0. What does it do with non-boot
> cpus on the resume path?
>
> Wei.

hyperv_init() registers hv_cpu_init()/hv_cpu_die() with the cpuhp framework:

    cpuhp = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv_init:online",
                              hv_cpu_init, hv_cpu_die);

In the hibernation procedure, the non-boot CPUs are automatically disabled
and reenabled, so hv_cpu_init()/hv_cpu_die() are automatically called for them,
e.g. in the resume path, see:
  hibernation_restore()
    resume_target_kernel()
      hibernate_resume_nonboot_cpu_disable()
        disable_nonboot_cpus()
      syscore_suspend()
        hv_cpu_die(0)  // Added by this patch
      swsusp_arch_resume()
        relocate_restore_code()
        restore_image()
          jump to the old kernel and we return from
          the swsusp_arch_suspend() in create_image()
            syscore_resume()
              hv_cpu_init(0)  // Added by this patch.
            suspend_enable_secondary_cpus()
            dpm_resume_start()
            ...
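
The same point as a toy user-space model: the hotplug callbacks cover every
CPU except CPU0, so CPU0 needs the extra call from the syscore path. This is
only a sketch under stated assumptions; the toy_* names and the 4-CPU layout
are made up, and none of this is the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* true while the toy hypervisor has an assist page enabled for that CPU */
static bool toy_assist_enabled[TOY_NR_CPUS];

static void toy_hv_cpu_init(unsigned int cpu) { toy_assist_enabled[cpu] = true; }
static void toy_hv_cpu_die(unsigned int cpu)  { toy_assist_enabled[cpu] = false; }

/* Stand-in for the hotplug path: it only ever touches CPUs 1..N-1. */
static void toy_disable_nonboot_cpus(void)
{
    for (unsigned int cpu = 1; cpu < TOY_NR_CPUS; cpu++)
        toy_hv_cpu_die(cpu);
}

/*
 * Returns how many CPUs still have an assist page enabled at the moment
 * the "new" kernel would jump to the "old" kernel.
 */
static unsigned int toy_enabled_at_jump(bool with_syscore_ops)
{
    unsigned int cpu, count = 0;

    /* The "new" kernel has booted with all CPUs online. */
    for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        toy_hv_cpu_init(cpu);

    toy_disable_nonboot_cpus();

    /* Stand-in for the syscore suspend step that handles CPU0. */
    if (with_syscore_ops)
        toy_hv_cpu_die(0);

    for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        count += toy_assist_enabled[cpu];

    return count;
}

/* Usage: without the syscore step, CPU0's page is still enabled. */
void toy_cpu0_demo(void)
{
    assert(toy_enabled_at_jump(false) == 1);
    assert(toy_enabled_at_jump(true) == 0);
}
```
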
Thanks,
-- Dexuan