[PATCH v2 0/9] x86/kvm/nVMX: optimize MMU switch between L1 and L2

From: Vitaly Kuznetsov
Date: Tue Sep 25 2018 - 13:59:00 EST


Changes since v1 [Sean Christopherson]:
- Drop the now unneeded local 'vmx' variable from vmx_free_vcpu_nested()
- Rename:
    kvm_mmu_scache -> kvm_mmu_extended_role
    mmu_role.scache -> mmu_role.ext
    mmu_role.base_role -> mmu_role.base
- Add BUILD_BUG_ONs checking MMU role unions sizes.
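
For readers unfamiliar with this kind of check, here is a minimal userspace
sketch of what size assertions on role unions guard against. It is not code
from the series: C11 static_assert stands in for the kernel's BUILD_BUG_ON,
and the bitfield members (level, direct, guest_mode, valid, cr0_pg, cr4_pae)
and the 'as_u64' member are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout only; the field names are assumptions. */
union base_role {
	uint32_t word;
	struct {
		unsigned int level:4;
		unsigned int direct:1;
		unsigned int guest_mode:1;
	};
};

union ext_role {
	uint32_t word;
	struct {
		unsigned int valid:1;
		unsigned int cr0_pg:1;
		unsigned int cr4_pae:1;
	};
};

/* Both halves viewed as one 64-bit value so roles compare in one step. */
union mmu_role {
	uint64_t as_u64;
	struct {
		union base_role base;
		union ext_role ext;
	};
};

/*
 * If a bitfield ever outgrows its backing word, a comparison through
 * 'word' or 'as_u64' would silently ignore the extra bits. Fail the
 * build instead.
 */
static_assert(sizeof(union base_role) == sizeof(uint32_t),
	      "base role must fit in 32 bits");
static_assert(sizeof(union ext_role) == sizeof(uint32_t),
	      "extended role must fit in 32 bits");
static_assert(sizeof(union mmu_role) == sizeof(uint64_t),
	      "full role must fit in 64 bits");
```

With checks like these in place, comparing two roles through the single
integer member is safe as long as both were zero-initialized before their
fields were set.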

Original description:

Currently, when we switch from L1 to L2 (VMX) we do the following:
- Re-initialize L1 MMU as shadow EPT MMU (nested_ept_init_mmu_context())
- Re-initialize 'nested' MMU (nested_vmx_load_cr3() -> init_kvm_nested_mmu())

When we switch back we do:
- Re-initialize L1 MMU (nested_vmx_load_cr3() -> init_kvm_tdp_mmu())

This seems to be sub-optimal. Initializing an MMU is expensive (thanks to
update_permission_bitmask(), update_pkru_bitmask(), ...). Try solving the
issue by splitting the L1-normal and L1-nested MMUs and checking whether an
MMU reset is really needed. This spares us about 1000 CPU cycles on nested
vmexit.
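
The "check if a reset is really needed" idea can be sketched in plain
userspace C as below. This is only an illustration under stated assumptions,
not the code from the series: struct mmu, make_role(), init_mmu() and
expensive_reinit() are hypothetical names, and the role fields are a
simplified stand-in for whatever inputs actually determine the MMU setup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified role: every input the MMU setup depends on. */
union mmu_role {
	uint32_t word;
	struct {
		unsigned int valid:1;	/* distinguishes "never initialized" */
		unsigned int level:4;
		unsigned int nxe:1;
		unsigned int smep:1;
		unsigned int smap:1;
		unsigned int guest_mode:1;
	};
};

struct mmu {
	union mmu_role role;		/* role used for the last setup */
	unsigned long reinit_count;	/* number of expensive setups */
};

/* Build a role from scratch; zeroing first keeps unused bits comparable. */
static union mmu_role make_role(unsigned int level, bool nxe, bool smep,
				bool smap, bool guest_mode)
{
	union mmu_role role = { .word = 0 };

	role.valid = 1;
	role.level = level;
	role.nxe = nxe;
	role.smep = smep;
	role.smap = smap;
	role.guest_mode = guest_mode;
	return role;
}

/* Stand-in for the costly part (permission bitmasks and similar). */
static void expensive_reinit(struct mmu *mmu)
{
	mmu->reinit_count++;
}

/* Returns true if the MMU was reconfigured, false if the work was skipped. */
static bool init_mmu(struct mmu *mmu, union mmu_role new_role)
{
	if (mmu->role.word == new_role.word)
		return false;

	expensive_reinit(mmu);
	mmu->role = new_role;
	return true;
}
```

Starting from a zeroed struct mmu, the first init_mmu() call with a given
role performs the setup, and repeated calls with an identical role return
false without redoing it. Keeping one such structure per MMU (L1-normal and
L1-nested) is what lets a switch back and forth hit the cached role.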

A brief look at SVM makes me think it can be optimized in exactly the same way.
I'll do this in a separate series if nobody objects.

Paolo Bonzini (1):
  x86/kvm/mmu: get rid of redundant kvm_mmu_setup()

Vitaly Kuznetsov (8):
  x86/kvm/mmu: make vcpu->mmu a pointer to the current MMU
  x86/kvm/mmu.c: set get_pdptr hook in kvm_init_shadow_ept_mmu()
  x86/kvm/mmu.c: add kvm_mmu parameter to kvm_mmu_free_roots()
  x86/kvm/mmu: introduce guest_mmu
  x86/kvm/mmu: make space for source data caching in struct kvm_mmu
  x86/kvm/nVMX: introduce source data cache for
    kvm_init_shadow_ept_mmu()
  x86/kvm/mmu: check if tdp/shadow MMU reconfiguration is needed
  x86/kvm/mmu: check if MMU reconfiguration is needed in
    init_kvm_nested_mmu()

 arch/x86/include/asm/kvm_host.h |  44 +++-
 arch/x86/kvm/mmu.c              | 345 +++++++++++++++++++-------------
 arch/x86/kvm/mmu.h              |   8 +-
 arch/x86/kvm/mmu_audit.c        |  12 +-
 arch/x86/kvm/paging_tmpl.h      |  15 +-
 arch/x86/kvm/svm.c              |  14 +-
 arch/x86/kvm/vmx.c              |  46 +++--
 arch/x86/kvm/x86.c              |  22 +-
 8 files changed, 305 insertions(+), 201 deletions(-)

--
2.17.1