Re: [PATCH 1/1] x86/cqm: Cqm requirements

From: David Carrillo-Cisneros
Date: Fri Mar 10 2017 - 20:54:37 EST


>
> Fine. So we need this for ONE particular use case. And if that is not well
> documented, including the underlying mechanics needed to analyze the data,
> then this will be a nice source of confusion for Joe User.
>
> I still think that this can be done differently while keeping the overhead
> small.
>
> You are looking at this through the lens of the existing perf mechanics,
> which require high-overhead context-switching machinery. But that's just
> wrong, because that's not how cache and bandwidth monitoring work.
>
> Contrary to the other perf counters, CQM and MBM are based on a
> context-selectable set of counters which do not require readout and
> reconfiguration when the switch happens.
>
> Especially with CAT in play, the context-switch overhead is there already
> when CAT partitions need to be switched. So switching the RMID at the same
> time is basically free, if we are smart enough to do an equivalent of the
> CLOSID context-switch mechanics and ideally combine both into a single MSR
> write.
>
> With that, the low-overhead periodic sampling can read the N counters
> related to the monitored set and provide N separate results. For bandwidth,
> the aggregation is a simple ADD; for cache residency it's pointless.
>
> Just because perf was designed with the regular performance counters in
> mind (way before the CQM/MBM stuff came around) does not mean that we
> cannot change/extend it if that makes sense.
>
> And looking at the way cache/bandwidth allocation and monitoring work, it
> makes a lot of sense. Definitely more than shoving it into the current
> modus operandi with duct tape just because we can.
>
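
For reference, the combined switch described above is possible because the
hardware exposes both IDs through a single register, IA32_PQR_ASSOC (MSR
0xc8f), with the RMID in bits 0-9 and the CLOSID in bits 32-63. A minimal
sketch of that context-switch path, where the pqr_state cache and the helper
name are illustrative rather than actual kernel symbols:

/*
 * Sketch: combined CLOSID/RMID update on context switch.
 * IA32_PQR_ASSOC carries the RMID in its low half and the CLOSID in
 * its high half, so both can be switched with a single wrmsr. The
 * per-CPU cache of the last written values lets us skip the MSR
 * write entirely when neither ID changes across the switch.
 */
#define MSR_IA32_PQR_ASSOC	0x0c8f

struct pqr_state {
	u32 rmid;
	u32 closid;
};

static DEFINE_PER_CPU(struct pqr_state, pqr_state);

static inline void pqr_switch_to(u32 rmid, u32 closid)
{
	struct pqr_state *state = this_cpu_ptr(&pqr_state);

	if (state->rmid == rmid && state->closid == closid)
		return;

	state->rmid = rmid;
	state->closid = closid;
	/* wrmsr(msr, lo, hi): RMID in the low word, CLOSID in the high word */
	wrmsr(MSR_IA32_PQR_ASSOC, rmid, closid);
}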

Point taken. The use case I described can be better served by the
low-overhead monitoring groups that Fenghua is working on. That
information can then be merged with the per-CPU profiles collected for
non-RDT events.
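
The per-RMID readout that such groups rely on is also cheap: program
IA32_QM_EVTSEL (MSR 0xc8d) with the event id and the RMID, then read the
value back from IA32_QM_CTR (MSR 0xc8e). A minimal sketch, using the event
ids from the SDM; the helper names and the error-bit macros are illustrative:

/*
 * Sketch: read one monitoring counter for a given RMID.
 * IA32_QM_EVTSEL selects the event (bits 0-7) and the RMID (bits
 * 32-41); IA32_QM_CTR then returns the value, with bit 62 flagging
 * "data unavailable" and bit 63 flagging an error.
 */
#define MSR_IA32_QM_EVTSEL	0x0c8d
#define MSR_IA32_QM_CTR		0x0c8e

#define QM_CTR_UNAVAIL		BIT_ULL(62)
#define QM_CTR_ERROR		BIT_ULL(63)

enum rmid_event {
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
};

static int rmid_read(u32 rmid, enum rmid_event evt, u64 *val)
{
	wrmsr(MSR_IA32_QM_EVTSEL, evt, rmid);
	rdmsrl(MSR_IA32_QM_CTR, *val);

	return (*val & (QM_CTR_UNAVAIL | QM_CTR_ERROR)) ? -EINVAL : 0;
}

/* Bandwidth for a monitored set is then a plain sum over its N RMIDs. */
static u64 mbm_total(const u32 *rmids, int nr_rmids)
{
	u64 sum = 0, val;
	int i;

	for (i = 0; i < nr_rmids; i++)
		if (!rmid_read(rmids[i], QOS_L3_MBM_TOTAL_EVENT_ID, &val))
			sum += val;
	return sum;
}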

I am OK with removing the perf-like CPU filtering from the requirements.

Thanks,
David