Re: [PATCH 0/4] x86: Add Cache QoS Monitoring (CQM) support

From: Peter Zijlstra
Date: Mon Jan 13 2014 - 02:55:47 EST


On Fri, Jan 10, 2014 at 06:55:11PM +0000, Waskiewicz Jr, Peter P wrote:
> I've spoken with the CPU architect, and he's set me straight. I was
> getting some simulation data and reality mixed up, so apologies.
>
> The cacheline is tagged with the RMID being tracked when it's brought
> into the cache. That is the only time it's tagged, it does not get
> updated (I was looking at data showing impacts if it was updated).
>
> If there are frequent RMID updates for a particular process, then there
> is the possibility that any remaining old data for that process can be
> accounted for on a different RMID. This really is workload dependent,
> and my architect provided their data showing that this occurrence is
> pretty much in the noise.

What change frequency and what sized workloads did they test?

I can make it significant; take a multi-threaded workload that mostly
fits in cache, then assign all threads but one RMID 0, then fairly
quickly rotate RMID 1 between the threads.

The problem is, since there's a limited number of RMIDs we have to
rotate at some point, but since changing RMIDs is nondeterministic we
can't.
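
To illustrate the effect, here is a toy model of tag-on-fill accounting. It
is only a sketch under the rule quoted above (a line is tagged with the RMID
when it is brought into the cache and never retagged); the names toy_cache,
toy_touch and toy_occupancy and the sizes are hypothetical and have nothing
to do with real hardware or kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

#define NLINES 64	/* hypothetical cache size, in lines */

struct toy_cache {
	int tag[NLINES];	/* RMID recorded at fill time, -1 = empty */
};

static void toy_init(struct toy_cache *c)
{
	for (size_t i = 0; i < NLINES; i++)
		c->tag[i] = -1;
}

/* A thread currently running with 'rmid' accesses 'line'. */
static void toy_touch(struct toy_cache *c, size_t line, int rmid)
{
	assert(line < NLINES);
	if (c->tag[line] < 0)
		c->tag[line] = rmid;	/* tagged on fill only; a hit never retags */
}

/* Number of lines currently accounted to 'rmid'. */
static size_t toy_occupancy(const struct toy_cache *c, int rmid)
{
	size_t n = 0;

	for (size_t i = 0; i < NLINES; i++)
		if (c->tag[i] == rmid)
			n++;
	return n;
}
```

If one thread fills 32 lines while holding RMID 1, is then moved to RMID 0
and keeps hitting the same lines, toy_occupancy() still reports those 32
lines under RMID 1 and none under RMID 0. The next thread that is handed
RMID 1 inherits that stale occupancy on top of its own fills.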

> Also, I did ask about the granularity of the RMID, and it is
> per-cacheline. So if there is a non-exclusive cacheline, then the
> occupancy data in the other part of the cacheline will count against the
> RMID.

One more question:

	u64 i;
	u64 rmid_val[];

	for (i = 0; i < rmid_max; i++) {
		wrmsr(IA32_QM_EVTSEL, 1 | (i << 32));
		rdmsr(IA32_QM_CTR, rmid_val[i]);
	}

Is this the right way of reading these values? I couldn't find anything
that says the event must 'run' to accumulate a value at all, so it
seems to be a direct value read with a multiplexer to the RMID.
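
For reference, the select/read pair can be written out as plain bit
manipulation. This is a sketch, not kernel code: the helper names are
hypothetical, the event ID of 1 and the RMID shift of 32 are taken from the
loop above, and the Error (bit 63) and Unavailable (bit 62) flags in the
value read back are an assumption from my reading of the SDM that should be
checked against the current revision.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QM_EVTSEL_RMID_SHIFT	32		/* RMID field, as in the loop above */
#define QM_CTR_ERROR		(1ULL << 63)	/* assumed flag position */
#define QM_CTR_UNAVAIL		(1ULL << 62)	/* assumed flag position */

/* Compose the value to write to IA32_QM_EVTSEL for one RMID and event. */
static uint64_t qm_evtsel(uint64_t rmid, uint64_t event)
{
	return event | (rmid << QM_EVTSEL_RMID_SHIFT);
}

/*
 * Decode a raw value read from IA32_QM_CTR.
 * Returns 0 and stores the data bits in *data, or -1 if either flag is set.
 */
static int qm_ctr_decode(uint64_t raw, uint64_t *data)
{
	assert(data != NULL);
	if (raw & (QM_CTR_ERROR | QM_CTR_UNAVAIL))
		return -1;
	*data = raw & ~(QM_CTR_ERROR | QM_CTR_UNAVAIL);
	return 0;
}
```

With these, the body of the loop would be a write of qm_evtsel(i, 1)
followed by a read that is passed through qm_ctr_decode() before the result
is stored in rmid_val[i].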

> > So my current mental model would tag a line with the current (ASSOC)
> > RMID on:
> > - load from DRAM -> L*, even for non-exclusive
> > - any to exclusive transition
> >
> > The result of such rules is that when the effective RMID of a task
> > changes it takes an indeterminate amount of time before the residency
> > stats reflect reality again.
> >
> > Furthermore; the IA32_QM_CTR is a misnomer as it's a VALUE not a COUNTER.
> > Not to mention the entire SDM 17.14.2 section is a mess; it purports to
> > describe how to detect the thing using CPUID but then also maybe
> > describes how to program it.
>
> I've given this feedback to the section owner in the SDM. There is an
> update due this month, and there will be some updates to this section
> (along with some additions).
>
> I should have my alternate implementation sent out shortly, just working
> a few kinks out of it. This is the proc-based and sysfs-based interface
> that will rely on a userspace program to handle the logic of grouping
> and assigning stuff together.

I've not figured out how to deal with this stuff yet; exposing RMIDs to
userspace is a guaranteed fail though. Any interface that prevents the
kernel from managing the RMIDs is broken.

