Re: [PATCH v4 08/16] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to manage guest DS buffer

From: Xu, Like
Date: Fri Apr 09 2021 - 03:07:51 EST


Hi Peter,

On 2021/4/8 15:52, Peter Zijlstra wrote:
>> This is because in the early part of this function, we have operations:
>>
>>     if (x86_pmu.flags & PMU_FL_PEBS_ALL)
>>         arr[0].guest &= ~cpuc->pebs_enabled;
>>     else
>>         arr[0].guest &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
>>
>> and if guest has PEBS_ENABLED, we need these bits back for PEBS counters:
>>
>>     arr[0].guest |= arr[1].guest;
>>
>> I can't keep up with you on this comment and would you explain more?
>
> I don't think that's right, who's to say they were set in the first
> place? The guest's GLOBAL_CTRL could have had the bits cleared at VMEXIT
> time. You can't unconditionally add PEBS_ENABLED into GLOBAL_CTRL,
> that's wrong.


To address your previous comments, does the code below look good to you?

static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
{
    struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
    struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
    struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
    struct kvm_pmu *pmu = (struct kvm_pmu *)data;
    u64 pebs_mask = (x86_pmu.flags & PMU_FL_PEBS_ALL) ?
            cpuc->pebs_enabled : (cpuc->pebs_enabled & PEBS_COUNTER_MASK);
    int i = 0;

    arr[i].msr = MSR_CORE_PERF_GLOBAL_CTRL;
    arr[i].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
    arr[i].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask;
    arr[i].guest &= ~pebs_mask;

    if (!x86_pmu.pebs)
        goto out;

    /*
     * If PMU counter has PEBS enabled it is not enough to
     * disable counter on a guest entry since PEBS memory
     * write can overshoot guest entry and corrupt guest
     * memory. Disabling PEBS solves the problem.
     *
     * Don't do this if the CPU already enforces it.
     */
    if (x86_pmu.pebs_no_isolation) {
        i++;
        arr[i].msr = MSR_IA32_PEBS_ENABLE;
        arr[i].host = cpuc->pebs_enabled;
        arr[i].guest = 0;
        goto out;
    }

    if (!pmu || !x86_pmu.pebs_vmx)
        goto out;

    i++;
    arr[i].msr = MSR_IA32_DS_AREA;
    arr[i].host = (unsigned long)ds;
    arr[i].guest = pmu->ds_area;

    if (x86_pmu.intel_cap.pebs_baseline) {
        i++;
        arr[i].msr = MSR_PEBS_DATA_CFG;
        arr[i].host = cpuc->pebs_data_cfg;
        arr[i].guest = pmu->pebs_data_cfg;
    }

    i++;
    arr[i].msr = MSR_IA32_PEBS_ENABLE;
    arr[i].host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask;
    arr[i].guest = pebs_mask & ~cpuc->intel_ctrl_host_mask;

    if (arr[i].host) {
        /* Disable guest PEBS if host PEBS is enabled. */
        arr[i].guest = 0;
    } else {
        /* Disable guest PEBS for cross-mapped PEBS counters. */
        arr[i].guest &= ~pmu->host_cross_mapped_mask;
        arr[0].guest |= arr[i].guest;
    }

out:
    *nr = ++i;
    return arr;
}
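
To make the bit arithmetic on the pebs_vmx path easier to check outside the
kernel, here is a small user-space sketch of it. It is only a model under
stated assumptions, not kernel code: struct pebs_model, struct guest_vals and
model_guest_values() are hypothetical names, PMU_FL_PEBS_ALL is assumed to be
set (so pebs_mask equals pebs_enabled), and pmu is assumed to be non-NULL.

```c
#include <stdint.h>

/* Hypothetical stand-in for the state read by intel_guest_get_msrs(). */
struct pebs_model {
    uint64_t intel_ctrl;   /* models x86_pmu.intel_ctrl */
    uint64_t host_mask;    /* models cpuc->intel_ctrl_host_mask */
    uint64_t guest_mask;   /* models cpuc->intel_ctrl_guest_mask */
    uint64_t pebs_enabled; /* models cpuc->pebs_enabled (== pebs_mask here) */
    uint64_t cross_mapped; /* models pmu->host_cross_mapped_mask */
};

/* Guest-side values that would be loaded at VM entry in this model. */
struct guest_vals {
    uint64_t global_ctrl;  /* models arr[0].guest */
    uint64_t pebs_enable;  /* models the PEBS_ENABLE entry's .guest */
};

static struct guest_vals model_guest_values(const struct pebs_model *m)
{
    struct guest_vals v;
    /* Host-side PEBS_ENABLE: PEBS counters not owned by the guest. */
    uint64_t host_pebs = m->pebs_enabled & ~m->guest_mask;

    /* Start with all PEBS counters removed from the guest GLOBAL_CTRL. */
    v.global_ctrl = m->intel_ctrl & ~m->host_mask & ~m->pebs_enabled;
    v.pebs_enable = m->pebs_enabled & ~m->host_mask;

    if (host_pebs) {
        /* Host PEBS is active: guest PEBS stays off entirely. */
        v.pebs_enable = 0;
    } else {
        /* Drop cross-mapped counters, then restore only the survivors. */
        v.pebs_enable &= ~m->cross_mapped;
        v.global_ctrl |= v.pebs_enable;
    }
    return v;
}
```

With intel_ctrl = 0xf, guest-only PEBS on counters 0 and 1, and counter 1
cross-mapped, the model enables PEBS for counter 0 only and puts back just
that one bit in the guest GLOBAL_CTRL. Bits are only restored if they were
stripped by pebs_mask first and survive the cross-mapped filter.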