Re: [PATCH v11 14/31] x86/resctrl: Discover hardware telemetry events

From: Reinette Chatre
Date: Wed Oct 08 2025 - 13:12:51 EST


Hi Tony,

On 10/7/25 1:47 PM, Luck, Tony wrote:
> On Mon, Oct 06, 2025 at 02:47:15PM -0700, Luck, Tony wrote:
>> On Mon, Oct 06, 2025 at 02:33:00PM -0700, Reinette Chatre wrote:
>>> Hi Tony,
>>>> static bool get_pmt_feature(enum pmt_feature_id feature, struct event_group **evgs,
>>>> unsigned int num_evg)
>>>> {
>>>> struct pmt_feature_group *p = intel_pmt_get_regions_by_feature(feature);
>>>> struct event_group **peg;
>>>> bool ret = false;
>>>>
>>>> if (IS_ERR_OR_NULL(p))
>>>> return false;
>>>>
>>>> for (peg = evgs; peg < &evgs[num_evg]; peg++) {
>>>> if (enable_events(*peg, p)) {
>>>> kref_get(&p->kref);
>>>
>>> This is not clear to me ... would enable_events() still mark all telemetry_regions
>>> that do not match the event_group's guid as unusable? It seems to me that if more
>>> than one even_group refers to the same pmt_feature_group then the first one to match
>>> will "win" and make the other event_group's telemetry regions unusable.
>>
>> Extra context needed. Sorry.
>>
>> I'm changing enable_events() to only mark telemetry_regions regions as
>> unusable if they have a bad package id, or the MMIO size doesn't match.
>> I.e. they truly are bad.
>>
>> Mis-match on guid will skip then while associating with a specific
>> event_gruoup, but leave them as usable.
>>
>> This means that intel_aet_read_event() now has to check the guid as
>> well as !addr.
>>
>> An alternative approach would be to ask the PMT code for separate
>> copies of the pmt_feature_group to attach to each event_group. I
>> didn't like this, do you think it would be better?
>
> Working through more patches in the series, I've come to the one
> that adjusts the number of RMIDs. The alternative approach of

I see, with the number of RMIDs a property of the event group self this
seems reasonable. While there is duplication of pmt_feature_group I am not
able to tell if this is a big issue since I am not clear on how/if systems
will be built this way.

> having a separate copy of the pmt_feature_group is suddently looking
> more attractive.
>
> So the code would become:
>
>
> static bool get_pmt_feature(enum pmt_feature_id feature, struct event_group **evgs,
> unsigned int num_evg)
> {
> struct pmt_feature_group *p;
> struct event_group **peg;
> bool ret = false;
>
> for (peg = evgs; peg < &evgs[num_evg]; peg++) {
> p = intel_pmt_get_regions_by_feature(feature);
> if (IS_ERR_OR_NULL(p))
> return false;
>
> if (enable_events(*peg, p)) {
> (*peg)->pfg = p;
> ret = true;
> } else {
> intel_pmt_put_feature_group(p);
> }
> }
> intel_pmt_put_feature_group(p);

I am not able to tell why this "put" is needed? I assume the "put" of a
pmt_feature_group assigned to an event_group will still be done in
intel_aet_exit()?

Reinette