Re: [PATCH] perf/x86/amd: cpu_hw_events::perf_ctr_virt_mask should only be used on host

From: Like Xu
Date: Tue Apr 12 2022 - 00:56:11 EST


On 12/4/2022 9:25 am, Dongli Si wrote:
On Mon 11 Apr 2022 22:29:18 +0800, Like Xu wrote:
Or you can work it out to make nested vPMU functional on AMD.
Unless in the future kvm wants to emulate on L1 HV the behavior of not
counting when the HO bit is set and SVM is disabled, otherwise it doesn't make

In fact, I would prefer that we make a little effort to enable nested vPMU.

sense to use perf_ctr_virt_mask in the guest to mask HO bit. At least for
now, this patch helps clarify what perf_ctr_virt_mask actually does.

This has been clarified by the commit 1018faa6cf23.


This is not a typical revert commit.
Thanks for pointing out the problem with my patch,
I will write another patch specifically to revert this commit.

The indispensable commit df51fe7ea1c1 is concerned with the symmetric use of
'disable_mask' in both __x86_pmu_enable_event() and x86_pmu_disable_event().
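
To make the symmetry concrete, here is a minimal user-space sketch, not the
kernel code. model_enable_value() and model_disable_value() are hypothetical
helpers that only model the value written to the event-select MSR on each
path. The bit positions are those of the kernel's ARCH_PERFMON_EVENTSEL_ENABLE
(bit 22) and AMD64_EVENTSEL_HOSTONLY (bit 41) definitions.

```c
#include <stdint.h>

#define EVENTSEL_ENABLE   (1ULL << 22)  /* ARCH_PERFMON_EVENTSEL_ENABLE */
#define EVENTSEL_HOSTONLY (1ULL << 41)  /* AMD64_EVENTSEL_HOSTONLY */

/* Hypothetical model of the enable path: set the enable bit, then strip
 * every bit in disable_mask before the value would reach the MSR. */
static uint64_t model_enable_value(uint64_t config, uint64_t disable_mask)
{
	return (config | EVENTSEL_ENABLE) & ~disable_mask;
}

/* Hypothetical model of the disable path: the same mask is stripped, so
 * both paths agree on which bits never reach the MSR. */
static uint64_t model_disable_value(uint64_t config, uint64_t disable_mask)
{
	return config & ~disable_mask;
}
```

With disable_mask set to EVENTSEL_HOSTONLY, neither path writes the HO bit.
With disable_mask set to 0, the HO bit passes through on both paths.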

I had this false assumption at that time, and we'd better support the
{HOST, GUEST}ONLY bits in L1 for the L2 guest; if we don't, it's a bug. Please help. :D


Please check the chronological order of the related commits and the motivations.
I know that commit df51fe7ea1c1c fixed the problem of using vPMU on old KVM,
but I think it's a speculative way and makes things a little obscure,
because this #GP is actually a KVM problem rather than a guest problem,

And we have 9b026073db2f to fix KVM for older guest kernels.

I think it is the user's responsibility to update their host kernel.

+ /*
+ * When SVM is disabled, set the Host-Only bit will cause the
+ * performance counter to not work.
It's ridiculous. Based on the AMD APM Table 13-3. Host/Guest Only Bits,
the performance counter would count "Host events" rather than "not work".
You are wrong; you can test it on the host. The description of
commit 1018faa6cf23 also points out this problem. This is the result of an
experiment; the AMD APM has not documented this problem.

I have to say it's true on a ZEN3 host after a quick experiment.


I forgot to say this is the behavior on the host, I will improve this
comment to specify 'why' more clearly, like this:
/*
* It turns out that when SVM is disabled on the host (L0), setting the

Again, we need the semantics to hold true on L1.

* Host-Only bit will cause the performance counter to not count.
*/

Note, your proposed change should work on L0, L1 and L2.
Yes, I tested it on L0, L1, L2 with 5.18-rc1 and it works as expected.

Specifically for L1, we need to rely on the EFER[SVME] check instead of the
meaningless "boot_cpu_has(X86_FEATURE_HYPERVISOR)".
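
As a rough illustration of that point, the decision can be keyed on the
EFER.SVME bit seen at the current level rather than on the hypervisor CPUID
bit. This is only a user-space sketch under that assumption:
model_virt_mask() is a hypothetical helper, not an existing kernel function,
and the efer argument stands in for whatever EFER value the level in question
observes. EFER.SVME is bit 12 and AMD64_EVENTSEL_HOSTONLY is bit 41.

```c
#include <stdint.h>

#define EFER_SVME         (1ULL << 12)  /* EFER.SVME: SVM enabled */
#define EVENTSEL_HOSTONLY (1ULL << 41)  /* AMD64_EVENTSEL_HOSTONLY */

/* Hypothetical helper: return the bits to strip from an event select.
 * When SVM is disabled, the Host-Only bit is masked out, because with it
 * set the counter was observed not to count. When SVM is enabled, no
 * bits are masked and the Host-Only bit keeps its meaning. */
static uint64_t model_virt_mask(uint64_t efer)
{
	return (efer & EFER_SVME) ? 0 : EVENTSEL_HOSTONLY;
}
```

The same check gives the same answer on L0 and on L1, which is the property
asked for above. A test on X86_FEATURE_HYPERVISOR cannot tell an L1 with SVM
enabled from one with SVM disabled.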


There is a related discussion here:
https://lore.kernel.org/all/20220320002106.1800166-1-sidongli1997@xxxxxxxxx/

Regards,
Dongli