RE: [PATCH] x86/hyper-v: Zero out the VP assist page to fix CPU offlining

From: Michael Kelley
Date: Sun Jul 07 2019 - 21:42:04 EST


From: Dexuan Cui <decui@xxxxxxxxxxxxx> Sent: Wednesday, July 3, 2019 6:46 PM
>
> When a CPU is being offlined, it usually still receives a few
> interrupts (e.g. reschedule IPIs) after hv_cpu_die() disables the
> HV_X64_MSR_VP_ASSIST_PAGE. If bit 0 of the apic_assist field happens
> to be 1 at that point, hv_apic_eoi_write() does not write the EOI MSR.
> As a result, Hyper-V may not be able to deliver all the interrupts to
> the CPU, the CPU may not be stopped, and the kernel will hang soon.
>
> The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section
> 5.2.1 "GPA Overlay Pages"), so once it is disabled the guest sees its
> own underlying page again. Zeroing that page at allocation time
> ensures the apic_assist field still reads as zero after the VP ASSIST
> PAGE is disabled.
>
> Fixes: ba696429d290 ("x86/hyper-v: Implement EOI assist")
> Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
> ---
> arch/x86/hyperv/hv_init.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 0e033ef11a9f..db51a301f759 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -60,8 +60,14 @@ static int hv_cpu_init(unsigned int cpu)
>  	if (!hv_vp_assist_page)
>  		return 0;
>
> +	/*
> +	 * The ZERO flag is necessary, because in the case of CPU offlining
> +	 * the page can still be used by hv_apic_eoi_write() for a while,
> +	 * after the VP ASSIST PAGE is disabled in hv_cpu_die().
> +	 */
>  	if (!*hvp)
> -		*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL);
> +		*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO,
> +				PAGE_KERNEL);
>
>  	if (*hvp) {
>  		u64 val;
> --
> 2.19.1
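
For illustration only, here is a minimal user-space sketch of the failure
mode described above. It is not kernel code. It assumes, per the patch
description, that hv_apic_eoi_write() skips the EOI MSR write when bit 0
of apic_assist is set, and (my understanding of the existing code) that
the field is cleared as it is read. The names vp_assist_model,
alloc_page_model() and eoi_write_model() are hypothetical, and the 0xff
fill merely stands in for whatever stale data a non-zeroed page may hold.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for the guest page behind the VP assist overlay. */
struct vp_assist_model {
	uint32_t apic_assist;	/* bit 0 set: "no EOI required" */
	uint8_t pad[MODEL_PAGE_SIZE - sizeof(uint32_t)];
};

/*
 * Allocate the backing page. zeroed == true models __GFP_ZERO;
 * zeroed == false fills the page with 0xff to model stale contents.
 */
static struct vp_assist_model *alloc_page_model(bool zeroed)
{
	struct vp_assist_model *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	memset(p, zeroed ? 0x00 : 0xff, sizeof(*p));
	return p;
}

/*
 * Model of the EOI decision: read and clear apic_assist, and skip the
 * MSR write if bit 0 was set. Returns true if the EOI MSR would be
 * written, false if the write would be skipped.
 */
static bool eoi_write_model(struct vp_assist_model *p)
{
	if (p) {
		uint32_t old = p->apic_assist;

		p->apic_assist = 0;	/* models the exchange with 0 */
		if (old & 0x1)
			return false;	/* EOI skipped */
	}
	return true;			/* EOI MSR written */
}
```

With a stale page, the first call to eoi_write_model() after the overlay
is disabled returns false, i.e. one EOI is lost, which is enough for the
hypervisor to stop delivering further interrupts to that CPU. With a
zeroed page every call returns true.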

Reviewed-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>