Re: [PATCH] KVM: X86: Reduce calls to vcpu_load

From: Sean Christopherson
Date: Wed Sep 06 2023 - 16:08:36 EST


On Wed, Sep 06, 2023, Xiaoyao Li wrote:
> On 9/6/2023 2:24 PM, Hao Peng wrote:
> > From: Peng Hao <flyingpeng@xxxxxxxxxxx>
> >
> > The call of vcpu_load/put takes about 1-2us. Each
> > kvm_arch_vcpu_create will call vcpu_load/put
> > to initialize some fields of vmcs, which can be
> > delayed until the call of vcpu_ioctl to process
> > this part of the vmcs field, which can reduce calls
> > to vcpu_load.
>
> what if no vcpu ioctl is called after vcpu creation?
>
> And will the first (it was second before this patch) vcpu_load() becomes
> longer? have you measured it?

I don't think the first vcpu_load() becomes longer; this avoids an entire
load()+put() pair by doing the initialization in the first ioctl().

That said, the patch is obviously buggy: it hooks kvm_arch_vcpu_ioctl() instead
of kvm_vcpu_ioctl(), so ioctls that are handled without going through the arch
hook, e.g. KVM_RUN, KVM_SET_SREGS, etc., will hit the uninitialized VMCS fields
and cause explosions.
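
To illustrate the dispatch problem, here is a rough userspace sketch. It is a
toy model, not KVM code: every toy_* name is made up for illustration, and it
only assumes that the generic handler services some commands itself and
forwards the rest to the arch handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are real KVM symbols. */
enum toy_ioctl { TOY_RUN, TOY_SET_SREGS, TOY_ARCH_ONLY };

struct toy_vcpu {
	bool vmcs_ready;	/* stands in for the lazily initialized fields */
	int faults;		/* handlers that ran before initialization */
};

static void toy_lazy_init(struct toy_vcpu *v)
{
	v->vmcs_ready = true;
}

/* Every handler touches the state; count uses of uninitialized state. */
static void toy_touch_state(struct toy_vcpu *v)
{
	if (!v->vmcs_ready)
		v->faults++;
}

/* Arch-level handler: the hook placement the patch is assumed to use. */
static void toy_arch_ioctl(struct toy_vcpu *v, bool hook_here)
{
	if (hook_here && !v->vmcs_ready)
		toy_lazy_init(v);
	toy_touch_state(v);
}

/* Generic handler: services some commands itself, forwards the rest. */
static void toy_generic_ioctl(struct toy_vcpu *v, enum toy_ioctl cmd,
			      bool hook_generic)
{
	if (hook_generic && !v->vmcs_ready)
		toy_lazy_init(v);

	switch (cmd) {
	case TOY_RUN:
	case TOY_SET_SREGS:
		/* Never reaches the arch handler, so an arch hook is skipped. */
		toy_touch_state(v);
		break;
	default:
		toy_arch_ioctl(v, !hook_generic);
		break;
	}
}

/* Issue one command as the first ioctl on a fresh vCPU; return the faults. */
static int toy_faults_after(enum toy_ioctl cmd, bool hook_generic)
{
	struct toy_vcpu v = { .vmcs_ready = false, .faults = 0 };

	toy_generic_ioctl(&v, cmd, hook_generic);
	return v.faults;
}
```

With the hook in the arch handler, toy_faults_after(TOY_RUN, false) reports a
use of uninitialized state; with the hook in the generic handler it does not.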

It will also break the TSC synchronization logic in kvm_arch_vcpu_postcreate(),
which can "race" with ioctl()s as the vCPU file descriptor is accessible by
userspace the instant it's installed into the fd table, i.e. userspace doesn't
have to wait for KVM_CREATE_VCPU to complete.

And I gotta imagine there are other interactions I haven't thought of off the
top of my head, e.g. the vCPU is also reachable via kvm_for_each_vcpu(). All it
takes is one path that touches a lazily initialized field for this to fall apart.

> I don't think it worth the optimization unless a strong reason.

Yeah, this is a lot of subtle complexity to shave 1-2us.