[PATCH] x86/hyperv: Free VP assist page from hv_cpu_die()

From: Vitaly Kuznetsov
Date: Thu Nov 03 2022 - 11:27:41 EST


Normally, 'hv_vp_assist_page[cpu]' points to the CPU's VP assist page mapping.
For the Hyper-V root partition, this is a 'memremap()' of the PFN given by
the hypervisor; for a non-root partition, it's a vmalloc() allocation. When
a CPU goes offline, hv_cpu_die() disables the VP assist page by writing
HV_X64_MSR_VP_ASSIST_PAGE and, in the root partition case, does memunmap().
For non-root partitions, the vmalloc()ed page remains allocated and thus
hv_cpu_init() has to check whether a new allocation is needed. This is
unnecessarily complicated. Let's always free the page from hv_cpu_die()
and allocate it back from hv_cpu_init(). All VP assist page users have to
be prepared for 'hv_vp_assist_page[cpu]' becoming NULL anyway, as that's
what already happens for the root partition.

The VP assist page has two users: KVM and APIC PV EOI. When a CPU goes
offline, there cannot be a running guest, so KVM's use case is safe. As
correctly noted in commit e320ab3cec7dd ("x86/hyper-v: Zero out the VP
ASSIST PAGE on allocation"), it is possible to see interrupts after
hv_cpu_die() and before the CPU is fully dead. hv_apic_eoi_write() is,
however, also prepared to see NULL in
'hv_vp_assist_page[smp_processor_id()]'. Moreover, checking a page which
is already unmapped from the hypervisor is incorrect in the first place.

While at it, adjust VP assist page disabling a bit: always write to
HV_X64_MSR_VP_ASSIST_PAGE first and unmap/free the corresponding page
after; this makes sure the hypervisor doesn't write to the already
freed memory in the interim.

Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
---
arch/x86/hyperv/hv_init.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index a0165df3c4d8..74be6f145fc4 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -104,8 +104,7 @@ static int hv_cpu_init(unsigned int cpu)
* in hv_cpu_die(), otherwise a CPU may not be stopped in the
* case of CPU offlining and the VM will hang.
*/
- if (!*hvp)
- *hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
+ *hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
if (*hvp)
msr.pfn = vmalloc_to_pfn(*hvp);

@@ -233,12 +232,17 @@ static int hv_cpu_die(unsigned int cpu)
* page here and nullify it, so that in future we have
* correct page address mapped in hv_cpu_init.
*/
- memunmap(hv_vp_assist_page[cpu]);
- hv_vp_assist_page[cpu] = NULL;
rdmsrl(HV_X64_MSR_VP_ASSIST_PAGE, msr.as_uint64);
msr.enable = 0;
}
wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, msr.as_uint64);
+
+ if (hv_root_partition)
+ memunmap(hv_vp_assist_page[cpu]);
+ else
+ vfree(hv_vp_assist_page[cpu]);
+
+ hv_vp_assist_page[cpu] = NULL;
}

if (hv_reenlightenment_cb == NULL)
--
2.38.1

