Re: [PATCH v5 0/4] KVM: X86: Paravirt remote TLB flush

From: Konrad Rzeszutek Wilk
Date: Wed Nov 15 2017 - 16:05:53 EST


On Mon, Nov 13, 2017 at 02:01:16AM -0800, Wanpeng Li wrote:
> The remote TLB flush APIs busy-wait, which is fine on bare metal.
> Within a guest, however, the target vCPUs might have been preempted
> or blocked. In that case the initiator vCPU would end up
> busy-waiting for a long time.
>
> This patch set implements a paravirtual TLB flush that makes sure the
> initiator does not wait for vCPUs that are sleeping. Instead, each
> sleeping vCPU flushes its TLB on guest entry. The idea was discussed here:
> https://lkml.org/lkml/2012/2/20/157
>
> The best result is achieved when the host is overcommitted by running
> multiple vCPUs on each pCPU. In this case the PV TLB flush avoids touching
> vCPUs which are not scheduled and avoids the wait on the main CPU.
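
For readers following along, the flow described above can be sketched as a
small user-space model in C. This is only an illustration under stated
assumptions, not the code from the series: the flag values, struct
vcpu_state, and the three function names are hypothetical stand-ins, and a
64-bit mask replaces the kernel's cpumask. The guest queues a deferred flush
on every preempted target and only keeps running targets in the IPI mask;
the host checks for a queued flush on guest entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values (assumptions, modeled on the description). */
#define VCPU_PREEMPTED (1u << 0)
#define VCPU_FLUSH_TLB (1u << 1)
#define MAX_VCPUS      64

/* Per-vCPU byte shared between guest and host (hypothetical layout). */
struct vcpu_state {
	_Atomic uint8_t preempted;
};

/*
 * Guest side: for each target vCPU that is currently preempted, try to
 * set the deferred-flush flag and drop it from the mask. Returns the
 * mask of vCPUs that still need a synchronous flush (an IPI).
 */
uint64_t pv_flush_tlb_others(struct vcpu_state *vcpus, int nvcpus,
			     uint64_t targets)
{
	uint64_t need_ipi = targets;

	assert(nvcpus > 0 && nvcpus <= MAX_VCPUS);

	for (int cpu = 0; cpu < nvcpus; cpu++) {
		uint64_t bit = UINT64_C(1) << cpu;
		uint8_t state;

		if (!(targets & bit))
			continue;

		state = atomic_load(&vcpus[cpu].preempted);
		/*
		 * If the compare-exchange fails, the state changed under
		 * us (the vCPU may be running again), so keep it in the
		 * IPI mask.
		 */
		if ((state & VCPU_PREEMPTED) &&
		    atomic_compare_exchange_strong(&vcpus[cpu].preempted,
				&state, (uint8_t)(state | VCPU_FLUSH_TLB)))
			need_ipi &= ~bit;
	}
	return need_ipi;
}

/* Host side: mark a vCPU as preempted when it is scheduled out. */
void host_vcpu_preempt(struct vcpu_state *v)
{
	atomic_store(&v->preempted, (uint8_t)VCPU_PREEMPTED);
}

/*
 * Host side: on guest entry, clear the state and report whether a
 * flush was queued while the vCPU was not running.
 */
bool host_vcpu_enter(struct vcpu_state *v)
{
	uint8_t old = atomic_exchange(&v->preempted, (uint8_t)0);

	return (old & VCPU_FLUSH_TLB) != 0;
}
```

With four vCPUs of which 1 and 3 are preempted, a flush aimed at all four
would leave only vCPUs 0 and 2 in the returned mask, and vCPUs 1 and 3 would
see the queued flush the next time they enter the guest.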
>
> In addition, thanks for commit 9e52fc2b50d ("x86/mm: Enable RCU based
> page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)")
>
> Tested on a Haswell i7 desktop with 4 cores (2HT), so 8 pCPUs, running
> ebizzy in one Linux guest.

8 pCPUs?
>
> ebizzy -M
>
>              vanilla    optimized    boost
>  8 vCPUs       10152        10083    -0.68%
> 16 vCPUs        1224         4866    297.5%
> 24 vCPUs        1109         3871    249%
> 32 vCPUs        1025         3375    229.3%

So this is all just one guest? What happens if you have, say, a 64-pCPU
machine with eight of these guests? That is a more realistic
workload in today's cloud environments.

>
> Note: The patchset is rebased against "locking/qspinlock/x86: Avoid
> test-and-set when PV_DEDICATED is set" v3
>
> v4 -> v5:
> * flushmask instead of cpumask
>
> v3 -> v4:
> * use READ_ONCE()
> * use try_cmpxchg instead of cmpxchg
> * add {} to if
> * no FLUSH flags to preserve during set_preempted
> * "KVM: X86" prefix to patch subject
>
> v2 -> v3:
> * percpu cpumask
>
> v1 -> v2:
> * a new CPUID feature bit
> * fix cmpxchg check
> * use kvm_vcpu_flush_tlb() to get the statistics right
> * just OR the KVM_VCPU_PREEMPTED in kvm_steal_time_set_preempted
> * add a new bool argument to kvm_x86_ops->tlb_flush
> * __cpumask_clear_cpu() instead of cpumask_clear_cpu()
> * not put cpumask_t on stack
> * rebase the patchset against "locking/qspinlock/x86: Avoid
> test-and-set when PV_DEDICATED is set" v3
>
> Wanpeng Li (4):
>   KVM: X86: Add vCPU running/preempted state
>   KVM: X86: Add paravirt remote TLB flush
>   KVM: X86: introduce invalidate_gpa argument to tlb flush
>   KVM: X86: Add flush_on_enter before guest enter
>
> Documentation/virtual/kvm/cpuid.txt | 4 ++++
> arch/x86/include/asm/kvm_host.h | 2 +-
> arch/x86/include/uapi/asm/kvm_para.h | 6 +++++
> arch/x86/kernel/kvm.c | 46 ++++++++++++++++++++++++++++++++++--
> arch/x86/kvm/cpuid.c | 3 ++-
> arch/x86/kvm/svm.c | 14 +++++------
> arch/x86/kvm/vmx.c | 21 ++++++++--------
> arch/x86/kvm/x86.c | 25 +++++++++++++-------
> 8 files changed, 88 insertions(+), 30 deletions(-)
>
> --
> 2.7.4
>