Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

From: Radim Krčmář
Date: Wed Nov 08 2017 - 13:41:31 EST


2017-10-31 10:02-0700, Eduardo Valentin:
> Hello Radim,
>
> On Tue, Oct 24, 2017 at 01:18:59PM +0200, Radim Krčmář wrote:
> > 2017-10-23 17:44-0700, Eduardo Valentin:
> > > Currently, the existing qspinlock implementation will fall back to
> > > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> >
> > Where have you detected the main source of overhead with pinned VCPUs?
> > Makes me wonder if we couldn't improve general PV_UNHALT,
>
> This is essentially for cases of non-overcommitted vCPUs in which we want
> the instance vCPUs to run uninterrupted as much as possible. Here by disabling
> the PV_UNHALT, we avoid the accounting needed to properly do the PV_UNHALT
> hypercall, as the lock holder won't be preempted anyway for the 1:1 pin case.

Right, I would expect that the scenario should very rarely go into the
halt/kick path -- is SPIN_THRESHOLD too low?

We could also try abolishing the SPIN_THRESHOLD completely and only use
vcpu_is_preempted() and state of the previous lock holder to enter the
halt/kick path.

(The drawback is that the flag behind vcpu_is_preempted() currently gets
set even when dropping into userspace.)
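
A minimal sketch of that idea, written as a self-contained userspace model
rather than kernel code. The struct, enum, and function names are
hypothetical; only vcpu_is_preempted() and the halt/kick path come from the
discussion above, and prev_preempted merely stands in for the former.

```c
#include <stdbool.h>

/*
 * Userspace model only. The names below are hypothetical and are not
 * kernel interfaces.
 */
struct waiter_view {
	bool lock_available;	/* the waiter could take the lock now */
	bool prev_preempted;	/* stand-in for vcpu_is_preempted() on the
				 * previous lock holder's CPU */
};

enum wait_action {
	WAIT_ACQUIRE,	/* take the lock */
	WAIT_SPIN,	/* keep spinning, the holder is presumably running */
	WAIT_HALT,	/* enter the halt/kick path */
};

/* Decide the next step without any spin counter or threshold. */
static enum wait_action next_wait_action(const struct waiter_view *v)
{
	if (v->lock_available)
		return WAIT_ACQUIRE;
	if (v->prev_preempted)
		return WAIT_HALT;
	return WAIT_SPIN;
}
```

In this model a waiter behind a 1:1 pinned holder would keep spinning and
never halt, as long as prev_preempted stays false. The drawback mentioned
above shows up as prev_preempted being true while the holder's vCPU is
merely in userspace.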

> > > This patch gives the opportunity to guest kernels to select
> > > between test-and-set and the regular queue fair lock implementation
> > > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > > flag is not set, the code will still fall back to test-and-set,
> > > but when the PV_DEDICATED flag is set, the code will use
> > > the regular queue spinlock implementation.
> >
> > Some flag makes sense and we do want to make sure that userspaces don't
> > enable it in pass-through-cpuid mode.
>
> Did you mean something like:
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0099e10..8ceb503 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -211,7 +211,8 @@ int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
>  	}
>  	for (i = 0; i < cpuid->nent; i++) {
>  		vcpu->arch.cpuid_entries[i].function = cpuid_entries[i].function;
> -		vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax;
> +		vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax &
> +			~KVM_FEATURE_PV_DEDICATED;
>  		vcpu->arch.cpuid_entries[i].ebx = cpuid_entries[i].ebx;
>  		vcpu->arch.cpuid_entries[i].ecx = cpuid_entries[i].ecx;
>  		vcpu->arch.cpuid_entries[i].edx = cpuid_entries[i].edx;
>
>
> But I do not see any other KVM_FEATURE_* being enforced (e.g. PV_UNHALT).
> Do you mind elaborating a bit here?

Sorry, nothing is needed. I somehow thought that we need to expose this
to userspace through CPUID, but KVM just needs to consider the flag
as reserved.