RE: [PATCH] x86/hyper-v: use cheaper HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE} hypercalls when possible

From: Michael Kelley (EOSG)
Date: Tue Jun 19 2018 - 13:57:00 EST


> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx <linux-kernel-owner@xxxxxxxxxxxxxxx> On Behalf
> Of Vitaly Kuznetsov
> Sent: Friday, June 15, 2018 9:30 AM
> To: x86@xxxxxxxxxx
> Cc: devel@xxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; KY Srinivasan
> <kys@xxxxxxxxxxxxx>; Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; Stephen Hemminger
> <sthemmin@xxxxxxxxxxxxx>; Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Ingo Molnar
> <mingo@xxxxxxxxxx>; H. Peter Anvin <hpa@xxxxxxxxx>; Tianyu Lan
> <Tianyu.Lan@xxxxxxxxxxxxx>
> Subject: [PATCH] x86/hyper-v: use cheaper HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE}
> hypercalls when possible
>
> While working on Hyper-V style PV TLB flush support in KVM I noticed that
> real Windows guests use TLB flush hypercall in a somewhat smarter way: when
> the flush needs to be performed on a subset of first 64 vCPUs or on all
> present vCPUs Windows avoids more expensive hypercalls which support
> sparse CPU sets and uses their 'cheap' counterparts. This means that
> HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED name is actually a misnomer: EX
> hypercalls (which support sparse CPU sets) are "available", not
> "recommended". This makes sense as they are actually harder to parse.
>
> Nothing stops us from being equally 'smart' in Linux too. Switch to
> doing cheaper hypercalls whenever possible.
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> ---

This is a good idea. We should probably do the same with the hypercalls for sending
IPIs -- try the simpler version first and move to the more complex _EX version only
if necessary.

A complication: We've recently found a problem with the code for doing IPI
hypercalls, and the bug affects the TLB flush code as well. As secondary CPUs
are started, there's a window of time where the hv_vp_index entry for a
secondary CPU is uninitialized. We are seeing IPIs happening in that window, and
the IPI hypercall code uses the uninitialized hv_vp_index entry. Same thing could
happen with the TLB flush hypercall code. I didn't actually see any occurrences of
the TLB case in my tracing, but we should fix it anyway in case a TLB flush gets
added at some point in the future.

KY has a patch coming. In the patch, hv_cpu_number_to_vp_number()
and cpumask_to_vpset() can both return U32_MAX if they encounter an
uninitialized hv_vp_index entry, and the code needs to be able to bail out to
the native functions for that particular IPI or TLB flush operation. Once the
initialization of secondary CPUs is complete, the uninitialized situation won't
happen again, and the hypercall path will always be used.

We'll need to coordinate on these patches. Be aware that the IPI flavor of the
bug is currently causing random failures when booting 4.18 RC1 on Hyper-V VMs
with large vCPU counts.

Reviewed-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>