[PATCH 1/3] kvm: x86: avoid atomic operations on APICv vmentry

From: Paolo Bonzini
Date: Tue Sep 27 2016 - 17:20:32 EST


On some benchmarks (e.g. netperf with ioeventfd disabled), APICv
posted interrupts turn out to be slower than interrupt injection via
KVM_REQ_EVENT.

This patch optimizes a bit the IRR update, avoiding expensive atomic
operations in the common case where PI.ON=0 at vmentry or the PIR vector
is mostly zero. This saves around 20 cycles per vmexit, as
measured by kvm-unit-tests' inl_from_qemu test (20 runs):

| enable_apicv=1 | enable_apicv=0
| mean stdev | mean stdev
----------|-----------------|------------------
before | 5826 32.65 | 5765 47.09
after | 5809 43.42 | 5777 77.02

Of course, any change in the right column is just placebo effect. :)
The savings are bigger if interrupts are frequent.

Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
---
arch/x86/kvm/lapic.c | 6 ++++--
arch/x86/kvm/vmx.c | 3 ++-
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 23b99f305382..63a442aefc12 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -342,9 +342,11 @@ void __kvm_apic_update_irr(u32 *pir, void *regs)
u32 i, pir_val;

for (i = 0; i <= 7; i++) {
- pir_val = xchg(&pir[i], 0);
- if (pir_val)
+ pir_val = READ_ONCE(pir[i]);
+ if (pir_val) {
+ pir_val = xchg(&pir[i], 0);
*((u32 *)(regs + APIC_IRR + i * 0x10)) |= pir_val;
+ }
}
}
EXPORT_SYMBOL_GPL(__kvm_apic_update_irr);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2577183b40d9..b33eee395b00 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4854,7 +4854,8 @@ static void vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);

- if (!pi_test_and_clear_on(&vmx->pi_desc))
+ if (!pi_test_on(&vmx->pi_desc) ||
+ !pi_test_and_clear_on(&vmx->pi_desc))
return;

kvm_apic_update_irr(vcpu, vmx->pi_desc.pir);
--
1.8.3.1