[PATCH] KVM: x86/pmu: Clear reserved bit PERF_CTL2[43] for AMD erratum 1292

From: Like Xu
Date: Mon Jan 17 2022 - 00:57:17 EST


From: Like Xu <likexu@xxxxxxxxxxx>

AMD Family 19h Models 00h-0Fh processors may experience sampling
inaccuracies that cause the performance counters listed in erratum 1292
to overcount retire-based events. To count the affected non-FP PMC
events correctly, a patched guest with a target vCPU model would:

- Use Core::X86::Msr::PERF_CTL2 to count the events, and
- Program Core::X86::Msr::PERF_CTL2[43] to 1b, and
- Program Core::X86::Msr::PERF_CTL2[20] to 0b.
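
The steps above amount to a small bit manipulation on the event-select
value before it is written to PERF_CTL2. A minimal sketch in plain C,
assuming a hypothetical helper name (erratum_1292_evtsel) that is not
part of this patch; the MSR write itself is left out:

```c
#include <stdint.h>

/* Bit positions taken from the workaround steps above. */
#define PERF_CTL_BIT43	(1ULL << 43)
#define PERF_CTL_BIT20	(1ULL << 20)

/*
 * Hypothetical helper, not part of this patch: adjust an event-select
 * value destined for PERF_CTL2 as the workaround describes.
 */
static inline uint64_t erratum_1292_evtsel(uint64_t evtsel)
{
	evtsel |= PERF_CTL_BIT43;	/* program PERF_CTL2[43] to 1b */
	evtsel &= ~PERF_CTL_BIT20;	/* program PERF_CTL2[20] to 0b */
	return evtsel;
}
```

All other bits of the value pass through unchanged, so the helper can be
applied to whatever event selection the guest would otherwise program.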

To support this usage by AMD guests, KVM should stop treating bit 43
as reserved, but only for counter #2. All other cases are unchanged.
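
The intended behavior can be modeled outside the kernel as a pure
function. This is only an illustrative sketch: evtsel_write_allowed and
its parameters are made-up names, and the reserved mask is passed in by
the caller rather than taken from KVM's PMU state:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified userspace model of the reserved-bit check; the names here
 * are illustrative and do not exist in KVM.
 */
static bool evtsel_write_allowed(uint64_t data, uint64_t reserved_bits,
				 unsigned int counter_idx,
				 unsigned int family, unsigned int model)
{
	/* Family 19h, Models 00h-0Fh are the parts named in the erratum. */
	bool affected = family == 0x19 && model < 0x10;

	/* Only counter #2 on an affected vCPU model may set bit 43. */
	if (counter_idx == 2 && affected)
		reserved_bits &= ~(1ULL << 43);

	return !(data & reserved_bits);
}
```

With bit 43 in the reserved mask, a write that sets bit 43 is accepted
only for counter index 2 on a Family 19h, Model 00h-0Fh vCPU; the same
write to any other counter, or on any other vCPU model, is still
treated as touching a reserved bit.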

Note, the host's perf subsystem will decide which hardware counter
will be used for the guest counter, based on its own physical CPU
model and its own workaround(s) in the host perf context.

Reported-by: Jim Mattson <jmattson@xxxxxxxxxx>
Signed-off-by: Like Xu <likexu@xxxxxxxxxxx>
---
arch/x86/kvm/svm/pmu.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 12d8b301065a..1111b12adcca 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -18,6 +18,17 @@
 #include "pmu.h"
 #include "svm.h"

+/*
+ * As a workaround of "Retire Based Events May Overcount" for erratum 1292,
+ * some patched guests may set PERF_CTL2[43] to 1b and PERF_CTL2[20] to 0b
+ * to count the non-FP affected PMC events correctly.
+ */
+static inline bool vcpu_overcount_retire_events(struct kvm_vcpu *vcpu)
+{
+	return guest_cpuid_family(vcpu) == 0x19 &&
+		guest_cpuid_model(vcpu) < 0x10;
+}
+
 enum pmu_type {
 	PMU_TYPE_COUNTER = 0,
 	PMU_TYPE_EVNTSEL,
@@ -252,6 +263,7 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	struct kvm_pmc *pmc;
 	u32 msr = msr_info->index;
 	u64 data = msr_info->data;
+	u64 reserved_bits;

 	/* MSR_PERFCTRn */
 	pmc = get_gp_pmc_amd(pmu, msr, PMU_TYPE_COUNTER);
@@ -264,7 +276,10 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	if (pmc) {
 		if (data == pmc->eventsel)
 			return 0;
-		if (!(data & pmu->reserved_bits)) {
+		reserved_bits = pmu->reserved_bits;
+		if (pmc->idx == 2 && vcpu_overcount_retire_events(vcpu))
+			reserved_bits &= ~BIT_ULL(43);
+		if (!(data & reserved_bits)) {
 			reprogram_gp_counter(pmc, data);
 			return 0;
 		}
--
2.33.1