Re: [PATCH v3 6/8] KVM: x86/svm/pmu: Add AMD PerfMonV2 support

From: Sean Christopherson
Date: Tue Jan 24 2023 - 19:10:22 EST


On Fri, Nov 11, 2022, Like Xu wrote:
> @@ -162,20 +179,42 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
> {
> struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> + struct kvm_cpuid_entry2 *entry;
> + union cpuid_0x80000022_ebx ebx;
>
> - if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
> - pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
> + pmu->version = 1;
> + if (kvm_cpu_cap_has(X86_FEATURE_AMD_PMU_V2) &&

Why check kvm_cpu_cap support? I.e. what will go wrong if userspace enumerates
PMU v2 to the guest without proper hardware/KVM support?

If this is _necessary_ to protect the host kernel, then we should probably have
a helper to query PMU features, e.g.

	static __always_inline bool guest_pmu_has(struct kvm_vcpu *vcpu,
						  unsigned int x86_feature)
	{
		return kvm_cpu_cap_has(x86_feature) &&
		       guest_cpuid_has(vcpu, x86_feature);
	}
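
To make the intent of such a helper concrete, here is a rough userspace model of the "host supports it AND guest CPUID enumerates it" check. Everything in it is hypothetical and for illustration only: the MODEL_FEAT_* bits, struct vcpu_model, model_host_caps and the model_*() functions are stand-ins, not KVM's actual types, feature numbers or helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; these are NOT the kernel's X86_FEATURE_* values. */
#define MODEL_FEAT_PERFCTR_CORE	0x1u
#define MODEL_FEAT_AMD_PMU_V2	0x2u

/* Stand-in for the per-vCPU CPUID state that userspace configured. */
struct vcpu_model {
	uint32_t guest_cpuid;
};

/* Stand-in for what hardware + KVM support (kvm_cpu_caps in the real code). */
static uint32_t model_host_caps;

static bool model_kvm_cpu_cap_has(uint32_t feat)
{
	return (model_host_caps & feat) != 0;
}

static bool model_guest_cpuid_has(const struct vcpu_model *vcpu, uint32_t feat)
{
	return (vcpu->guest_cpuid & feat) != 0;
}

/* True only if the feature is both host-supported and guest-enumerated. */
static bool model_guest_pmu_has(const struct vcpu_model *vcpu, uint32_t feat)
{
	return model_kvm_cpu_cap_has(feat) && model_guest_cpuid_has(vcpu, feat);
}
```

With a model like this, a guest whose CPUID advertises PMU v2 on a host that lacks it would be treated as not having the feature, which is the behavior the helper is meant to centralize.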



> + guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2)) {
> + pmu->version = 2;
> + entry = kvm_find_cpuid_entry_index(vcpu, 0x80000022, 0);
> + ebx.full = entry->ebx;
> + pmu->nr_arch_gp_counters = min3((unsigned int)ebx.split.num_core_pmc,
> + (unsigned int)kvm_pmu_cap.num_counters_gp,
> + (unsigned int)KVM_AMD_PMC_MAX_GENERIC);

Blech. This really shouldn't be necessary, KVM should tweak kvm_pmu_cap.num_counters_gp
as needed during initialization to ensure num_counters_gp doesn't exceed KVM's
internal limits.

Posted a patch[*], please take a look. As mentioned in that thread, I'll somewhat
speculatively apply that series sooner rather than later so that you can use it as
a base for this series (assuming the patch isn't busted).

[*] https://lore.kernel.org/all/20230124234905.3774678-2-seanjc@xxxxxxxxxx
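
For illustration, the idea of clamping once at initialization can be sketched in userspace as below. This is not the posted patch, just a model of the concept; struct pmu_cap_model, model_pmu_cap_init() and the MODEL_PMC_MAX_GENERIC value are assumptions made up for the example.

```c
/* Assumed internal limit on general-purpose counters, for illustration only. */
#define MODEL_PMC_MAX_GENERIC	6u

/* Stand-in for the global PMU capabilities that KVM records at init time. */
struct pmu_cap_model {
	unsigned int num_counters_gp;
};

/*
 * Record what the hardware reports, clamped to the internal limit, so that
 * per-vCPU refresh code never has to re-check the limit itself.
 */
static void model_pmu_cap_init(struct pmu_cap_model *cap,
			       unsigned int hw_num_counters)
{
	cap->num_counters_gp = hw_num_counters < MODEL_PMC_MAX_GENERIC ?
			       hw_num_counters : MODEL_PMC_MAX_GENERIC;
}
```

Once the capability is clamped like this, callers only need a two-way min() against the guest's CPUID value, and the min3() with the cast to unsigned int goes away.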

> + }
> +
> + /* Commitment to minimal PMCs, regardless of CPUID.80000022 */

Please expand this comment. I'm still not entirely sure I've interpreted it correctly,
and I'm not sure that I agree with the code.

> + if (kvm_cpu_cap_has(X86_FEATURE_PERFCTR_CORE) &&

AFAICT, checking kvm_cpu_cap_has() is an unrelated change. Either it's a bug fix
and belongs in a separate patch, or it's unnecessary and should be dropped.

> + guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
> + pmu->nr_arch_gp_counters = max_t(unsigned int,
> + pmu->nr_arch_gp_counters,
> + AMD64_NUM_COUNTERS_CORE);
> else
> - pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
> + pmu->nr_arch_gp_counters = max_t(unsigned int,
> + pmu->nr_arch_gp_counters,
> + AMD64_NUM_COUNTERS);

Using max() doesn't look right, e.g. if KVM ends up running on some odd setup
where ebx.split.num_core_pmc/kvm_pmu_cap.num_counters_gp is less than
AMD64_NUM_COUNTERS_CORE or AMD64_NUM_COUNTERS.

Or more likely, if userspace says "only expose N counters to this guest".

Shouldn't this be something like?

	if (guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2))
		pmu->nr_arch_gp_counters = min(ebx.split.num_core_pmc,
					       kvm_pmu_cap.num_counters_gp);
	else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
	else
		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
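
To illustrate the concern with max(), here is a small userspace model comparing the two selection schemes. It is only a sketch under assumptions: the MODEL_* constants are assumed values, feature checks are collapsed into plain bools, the counter count is assumed to start at 0 before refresh, and none of the function names exist in KVM.

```c
#include <stdbool.h>

/* Assumed values for illustration; the real constants live in kernel headers. */
#define MODEL_NUM_COUNTERS	4u	/* legacy counter count */
#define MODEL_NUM_COUNTERS_CORE	6u	/* count with PERFCTR_CORE */
#define MODEL_PMC_MAX_GENERIC	6u	/* internal upper limit */

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

static unsigned int max_uint(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Models the scheme in the patch: compute the v2 value, then apply a floor. */
static unsigned int model_nr_counters_patch(bool has_v2, bool has_core,
					    unsigned int cpuid_num_pmc,
					    unsigned int cap_num_gp)
{
	unsigned int n = 0;	/* assumed starting value */

	if (has_v2)
		n = min_uint(min_uint(cpuid_num_pmc, cap_num_gp),
			     MODEL_PMC_MAX_GENERIC);

	return max_uint(n, has_core ? MODEL_NUM_COUNTERS_CORE
				    : MODEL_NUM_COUNTERS);
}

/* Models the suggested scheme: v2 takes precedence and is never raised. */
static unsigned int model_nr_counters_suggested(bool has_v2, bool has_core,
						unsigned int cpuid_num_pmc,
						unsigned int cap_num_gp)
{
	if (has_v2)
		return min_uint(cpuid_num_pmc, cap_num_gp);
	if (has_core)
		return MODEL_NUM_COUNTERS_CORE;
	return MODEL_NUM_COUNTERS;
}
```

In this model, a guest that enumerates PMU v2 with only 2 counters gets 6 under the max()-based scheme but 2 under the suggested one, i.e. the floor overrides what userspace asked for.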