Re: [PATCH v1] KVM: x86: PMU Whitelist

From: Wei Wang
Date: Tue Jun 25 2019 - 05:07:25 EST


On 06/25/2019 08:32 AM, Eric Hankland wrote:
Thanks for your feedback - I'll send out an updated version
> incorporating your comments shortly (assuming you don't have more
> after this).

Actually I have another thing to discuss:
probably we could consider to make this filter list white/black configurable
from userspace. For example, userspace option: filter-list=white/black

This reason is that for now, we start with only a couple of events to whitelist.
As people gradually add more (or future hardware has some enhancements to
give you more confidence), finally there may have much more whitelisted events.
Then users could reuse this interface to switch to "filter-list=black",
this will be more efficient, considering the amount of events to enlist and the
iteration of event comparisons.


+struct kvm_pmu_whitelist { + __u64 event_mask;
>>
>> Is this "ARCH_PERFMON_EVENTSEL_EVENT |
>> ARCH_PERFMON_EVENTSEL_UMASK"?
>
> In most cases, I envision this being the case, but it's possible
> users may want other bits - see response to the next question below.
>

Probably we don't need this field to be passed from userspace?

We could directly use AMD64_RAW_EVENTMASK_NB, which includes bit[35:32].
Since those bits are reserved on Intel CPUs, have them as mask should be fine.

Alternatively, we could add this event_mask field to struct kvm_pmu, and initalize
it in the vendor specific intel_pmu_init or amd_pmu_init.

Both options above look good to me.


+ __u16 num_events; + __u64 events[0];
>>
>> Can this be __u16? The lower 16 bits (umask+eventsel) already
>> determines what the event is.
>
> It looks like newer AMD processors also use bits 32-35 for eventsel
> (see AMD64_EVENTSEL_EVENT/AMD64_RAW_EVENT_MASK in
> arch/x86/include/asm/perf_event.h or a recent reference guide),
> though it doesn't look like this has made it to pmu_amd.c in kvm
> yet.

OK, thanks for the reminder on the AMD side. I'm fine to keep it __u64.


> Further, including the whole 64 bits could enable whitelisting some
> events with particular modifiers (e.g. in_tx=0, but not in_tx=1).
> I'm not sure if whitelisting with specific modifiers will be
> necessary,

I think "eventsel+umask" should be enough to identify the event.
Other modifiers could be treated with other options when needed (e.g. AnyThread),
otherwise it would complicate the case (e.g. double/trouble the number of total events).


> but we definitely need more than u16 if we want to support any AMD
> events that make use of those bits in the future.
>
>>> + struct kvm_pmu_whitelist *whitelist;
>>
>> This could be per-VM and under rcu?
> I'll try this out in the next version.
>
>> Why not moving this filter to reprogram_gp_counter?
>>
>> You could directly compare "unit_mask, event_sel" with
>> whitelist->events[i]
> The reason is that this approach provides uniform behavior whether
> an event is programmed on a fixed purpose counter vs a general
> purpose one. Though I admit it's unlikely that instructions
> retired/cycles wouldn't be whitelisted (and ref cycles can't be
> programmed on gp counters), so it wouldn't be missing too much if I
> do move this to reprogram_gp_counter. What do you think?
>

I'd prefer to moving it to reprogram_gp_counter, which
simplifies the implementation (we don't need
.get_event_code as you added in this version.), and it
should also be faster.

Another reason is that we are trying to build an alternative path
of PMU virtualization, which makes the vPMU sits on the hardware
directly, instead of the host perf subsystem. That path wouldn't
go deeper to reprogram_pmc, which invloves the perf subsystem
related implementation, but that path would still need this filter list.

For the fixed counter, we could add a bitmap flag to kvm_arch,
indicating which counter is whitelist-ed based on the
"eventsel+umask" value passed from userspace. This flag is
updated when updating the whitelist-ed events to kvm.
For example, if userspace gives "00+01" (INST_RETIRED_ANY),
then we enable fixed counter0 in the flag.

When reprogram_fixed_counter, we check the flag and return
if the related counter isn't whitelisted.

Best,
Wei