Re: [PATCH 4/7] KVM: x86/pmu: Not to generate PEBS records for emulated instructions

From: Like Xu
Date: Wed Jul 20 2022 - 22:22:37 EST


On 21/7/2022 8:51 am, Sean Christopherson wrote:
"Don't" instead of "Not to". Not is an adverb, not a verb itself.

On Wed, Jul 13, 2022, Like Xu wrote:
From: Like Xu <likexu@xxxxxxxxxxx>

The KVM accumulate an enabeld counter for at least INSTRUCTIONS or

Probably just "KVM" instead of "the KVM"?

s/enabeld/enabled

Applied, thanks.


BRANCH_INSTRUCTION hw event from any KVM emulated instructions,
generating emulated overflow interrupt on counter overflow, which
in theory should also happen when the PEBS counter overflows but
it currently lacks this part of the underlying support (e.g. through
software injection of records in the irq context or a lazy approach).

In this case, KVM skips the injection of this BUFFER_OVF PMI (effectively
dropping one PEBS record) and let the overflow counter move on. The loss
of a single sample does not introduce a loss of accuracy, but is easily
noticeable for certain specific instructions.

This issue is expected to be addressed along with the issue
of PEBS cross-mapped counters with a slow-path proposal.

Fixes: 79f3e3b58386 ("KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter")
Signed-off-by: Like Xu <likexu@xxxxxxxxxxx>
---
arch/x86/kvm/pmu.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 02f9e4f245bd..08ee0fed63d5 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -106,9 +106,14 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 		return;
 	if (pmc->perf_event && pmc->perf_event->attr.precise_ip) {
-		/* Indicate PEBS overflow PMI to guest. */
-		skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
-					      (unsigned long *)&pmu->global_status);
+		if (!in_pmi) {
+			/* The emulated instructions does not generate PEBS records. */

This needs a better comment. IIUC, it's not that they don't generate records,
it's that KVM is _choosing_ to not generate records to hack around a different
bug(s). If that's true a TODO or FIXME would also be nice.

Indeed. To give more context, this part will look like this:

	if (!in_pmi) {
		/*
		 * TODO: KVM is currently _choosing_ to not generate records
		 * for emulated instructions, avoiding BUFFER_OVF PMI when
		 * there are no records. Strictly speaking, it should be done
		 * as well in the right context to improve sampling accuracy.
		 */
		skip_pmi = true;
	} else {
		/* Indicate PEBS overflow PMI to guest. */
		skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
					      (unsigned long *)&pmu->global_status);
	}

What do you think?


+			skip_pmi = true;
+		} else {
+			/* Indicate PEBS overflow PMI to guest. */
+			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
+						      (unsigned long *)&pmu->global_status);
+		}
 	} else {
 		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
 	}
--
2.37.0