Re: [PATCH] kvm: x86: Improve virtual machine startup performance

From: Sean Christopherson
Date: Wed Mar 02 2022 - 20:29:13 EST


On Wed, Mar 02, 2022, Hao Peng wrote:
> On Wed, Mar 2, 2022 at 1:54 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Tue, Mar 01, 2022, Peng Hao wrote:
> > > From: Peng Hao <flyingpeng@xxxxxxxxxxx>
> > >
> > > vcpu 0 will repeatedly enter/exit the SMM state during the startup
> > > phase, and kvm_init_mmu will be called repeatedly during this process.
> > > There are parts of the mmu initialization code that do not need to be
> > > modified after the first initialization.
> > >
> > > Statistics from my server: while starting the virtual machine, vcpu 0
> > > calls kvm_init_mmu more than 600 times (due to SMM state switching).
> > > The patch saves about 36 microseconds in total.
> > >
> > > Signed-off-by: Peng Hao <flyingpeng@xxxxxxxxxxx>
> > > ---
> > > @@ -5054,7 +5059,7 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > >  void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
> > >  {
> > >         kvm_mmu_unload(vcpu);
> > > -       kvm_init_mmu(vcpu);
> > > +       kvm_init_mmu(vcpu, false);
> >
> > This is wrong, kvm_mmu_reset_context() is the "big hammer" and is expected to
> > unconditionally get the MMU to a known good state. E.g. failure to initialize
> > means this code:
> >
> >         context->shadow_root_level = kvm_mmu_get_tdp_level(vcpu);
> >
> > will not update the shadow_root_level as expected in response to userspace changing
> > guest.MAXPHYADDR in such a way that KVM enables/disables 5-level paging.
> >
> Thanks for pointing this out. However, apart from shadow_root_level, the
> other fields of context (page_fault, sync_page, direct_map, and so on) do
> not change during the entire operation when tdp_mmu is enabled.
> Would this patch still be viable once I have carefully confirmed which
> fields are never modified?

No, passing around the "init" flag is a hack.

But, we can achieve what you want simply by initializing the constant data once
per vCPU. There's a _lot_ of state that is constant for a given MMU now that KVM
uses separate MMUs for L1 vs. L2 when TDP is enabled. I should get patches posted
tomorrow, just need to test (famous last words).
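
To illustrate the split I have in mind, here is a minimal standalone sketch,
not the actual patches. Every name prefixed with toy_ is hypothetical, and the
"MAXPHYADDR > 48 means 5-level" rule is an assumption made only for the
example. The constant fields are written once when the vCPU is created, while
the reset path stays unconditional and recomputes only the state that can
change at runtime, so no "init" flag is needed:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real KVM structures. */
struct toy_mmu {
	/* Constant for a given MMU: written once at vCPU creation. */
	int (*page_fault)(unsigned long addr);
	bool direct_map;
	/* Depends on guest CPUID, so it must be recomputed on every reset. */
	int shadow_root_level;
};

struct toy_vcpu {
	int maxphyaddr;		/* stands in for guest.MAXPHYADDR */
	struct toy_mmu mmu;
};

static int toy_tdp_page_fault(unsigned long addr)
{
	(void)addr;
	return 0;
}

/* Assumed rule for illustration only: 5-level paging above 48 bits. */
static int toy_get_tdp_level(const struct toy_vcpu *vcpu)
{
	return vcpu->maxphyaddr > 48 ? 5 : 4;
}

/* One-time setup of the constant data, called at vCPU creation. */
void toy_mmu_create(struct toy_vcpu *vcpu)
{
	vcpu->mmu.page_fault = toy_tdp_page_fault;
	vcpu->mmu.direct_map = true;
}

/* The "big hammer": unconditional, but touches only the variable state. */
void toy_mmu_reset_context(struct toy_vcpu *vcpu)
{
	vcpu->mmu.shadow_root_level = toy_get_tdp_level(vcpu);
}

/* Self-check: a MAXPHYADDR change must be picked up by the next reset. */
int toy_demo(void)
{
	struct toy_vcpu vcpu = { .maxphyaddr = 46 };

	toy_mmu_create(&vcpu);
	toy_mmu_reset_context(&vcpu);
	assert(vcpu.mmu.shadow_root_level == 4);

	vcpu.maxphyaddr = 52;	/* userspace changes guest CPUID */
	toy_mmu_reset_context(&vcpu);
	assert(vcpu.mmu.shadow_root_level == 5);
	assert(vcpu.mmu.page_fault == toy_tdp_page_fault);
	return 0;
}
```

Calling toy_demo() from a main() exercises both resets. The point is that the
repeated SMM transitions only pay for the one field that can actually change,
and the reset path still ends up in a known good state.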

Also, based on the number of SMM transitions, I'm guessing you're using SeaBIOS.
Have you tried disabling CONFIG_CALL32_SMM, or CONFIG_USE_SMM altogether? That
might be an even better way to improve performance in your environment.

Last question, do you happen to know why eliminating this code shaves 36us? The
raw writes don't seem like they'd take that long. Maybe the writes to function
pointers trigger stalls or mispredicts or something? If you don't have an easy
answer, don't bother investigating, I'm just curious.