[RFC PATCH] KVM/x86/vPMU: Avoid counter reprogramming in kvm_pmu_handle_event

From: kan . liang
Date: Thu Dec 06 2018 - 14:12:18 EST


From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>

While handling a guest overflow, KVM unconditionally reprograms the perf
counters before entering the guest. The reprogramming is very expensive,
and in the common case (e.g. the vCPU still runs on the same CPU) it is
unnecessary.

Here is the current process of handling an overflow triggered by the guest.
The patch intends to avoid the reprogramming in step 2.

   PERF (HOST)                 KVM                         PERF (GUEST)

1. intel_pmu_handle_irq():
     Disable the counter
     ...
     overflow callback
                               overflow_intr():
                                 request KVM_REQ_PMU
                                 inject PMI to guest
                               overflow_intr() exit
     Enable the counter
   intel_pmu_handle_irq() exit
   ...

2.                             vcpu_enter_guest():
                                 kvm_pmu_handle_event():
                                   reprogram_counter():
                                     pmc_stop_counter():
     Close the counter
                                     pmc_reprogram_counter():
     Create a new counter
                                 ...

3.                                                         intel_pmu_handle_irq():
                                                             Disable all counters
                               pmc_stop_counter():
     Close the counter
                                                             ...
                                                             Enable all counters
                               pmc_reprogram_counter():
     Create a new counter
                                                           intel_pmu_handle_irq exit

The counter needs to be reprogrammed for the new CPU only when the vCPU
moves to another CPU before step 2. Otherwise the reprogramming should be
avoided, because nothing has changed for the perf event and the perf
subsystem takes care of the assigned counter.
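
The decision described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the kernel code:
the model_* structures and model_handle_event() are hypothetical stand-ins
that keep just the fields the check needs, and the expensive stop + recreate
path is reduced to a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_COUNTERS 64	/* hypothetical; one bit per counter */

/* Hypothetical stand-ins for the kernel structures. */
struct model_event { int oncpu; };	/* CPU the perf event currently sits on */
struct model_pmc { struct model_event *perf_event; };
struct model_pmu {
	uint64_t reprogram_pmi;		/* pending reprogram requests, one bit each */
	struct model_pmc pmc[MODEL_NR_COUNTERS];
};

/*
 * Walk the pending requests and return how many counters really had to
 * take the expensive stop + recreate path.
 */
static int model_handle_event(struct model_pmu *pmu, int vcpu_cpu)
{
	uint64_t bitmask;
	int reprogrammed = 0;

	assert(pmu != NULL);
	bitmask = pmu->reprogram_pmi;	/* snapshot of the pending requests */

	for (int bit = 0; bit < MODEL_NR_COUNTERS; bit++) {
		struct model_pmc *pmc = &pmu->pmc[bit];
		uint64_t mask = UINT64_C(1) << bit;

		if (!(bitmask & mask))
			continue;

		/* No event, or the event is already on the vCPU's CPU: skip. */
		if (!pmc->perf_event || pmc->perf_event->oncpu == vcpu_cpu) {
			pmu->reprogram_pmi &= ~mask;
			continue;
		}

		/*
		 * Placeholder for the stop + recreate path. Model assumption:
		 * the recreated event ends up on the vCPU's CPU and the
		 * request bit is cleared afterwards.
		 */
		pmc->perf_event->oncpu = vcpu_cpu;
		pmu->reprogram_pmi &= ~mask;
		reprogrammed++;
	}
	return reprogrammed;
}
```

In this model, a vCPU on CPU 3 with three pending requests (one event on
CPU 3, one on CPU 5, one counter without an event) takes the expensive path
once, for the event that sits on CPU 5.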

The patch doesn't affect the counter value, because the counter doesn't
count in the host anyway.

The patch doesn't affect the behavior of step 3 (the guest PMI handler).
intel_pmu_handle_irq() is just an example; it could be any PMI handler.

The average duration of kvm_pmu_handle_event() is 85,282,765ns on
a 4-socket SKX with one guest that is sampling a single event with perf.
With the patch, the average duration is only 97ns.
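
To put the two averages above in proportion, the ratio can be computed
directly. This is a minimal sketch; speedup_ratio() is a hypothetical helper
introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Ratio of the average duration before the patch to the average after it. */
static double speedup_ratio(uint64_t before_ns, uint64_t after_ns)
{
	assert(after_ns != 0);	/* avoid dividing by zero */
	return (double)before_ns / (double)after_ns;
}
```

With the figures quoted above, speedup_ratio(85282765, 97) is roughly
879,000, i.e. the average duration drops by almost six orders of magnitude.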

Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
---
arch/x86/kvm/pmu.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 58ead7d..8436f32 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -221,8 +221,8 @@ EXPORT_SYMBOL_GPL(reprogram_counter);
 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	int bit, event_cpu;
 	u64 bitmask;
-	int bit;

 	bitmask = pmu->reprogram_pmi;

@@ -234,6 +234,16 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 			continue;
 		}

+		/*
+		 * It doesn't need to reprogram the counters unless
+		 * the CPU which vcpu runs on has changed.
+		 */
+		event_cpu = READ_ONCE(pmc->perf_event->oncpu);
+		if (event_cpu == vcpu->cpu) {
+			clear_bit(bit, (unsigned long *)&pmu->reprogram_pmi);
+			continue;
+		}
+
 		reprogram_counter(pmu, bit);
 	}
 }
--
2.7.4