Re: [PATCH V4 0/7] x86/intel_rdt: Intel Cache Allocation Technology

From: Vikas Shivappa
Date: Thu Mar 19 2015 - 18:20:08 EST



Hello Ingo/Peter,

On Thu, 26 Feb 2015, Ingo Molnar wrote:


* Luck, Tony <tony.luck@xxxxxxxxx> wrote:

The CAT thing was annoying already, but at least one
can find that in the SDM, this RDT thing, not a single
mention.

The problems of development at the bleeding edge. Would
you rather Linux sat on the sidelines until there are
enough Google hits from other users of new features?

Well, we'd prefer there to be A) published documentation,
or, lacking published documentation, there be B) a coherent
technical description within the code itself what the
purpose is and how it all works conceptually (minus the
buzzwords), so that we have a common starting point when
reviewing it.


Below is a short summary of the feature (I have added detailed documentation in patch 7/7 that fully explains the feature and its usage).

RDT (Resource Director Technology) is the umbrella term covering both monitoring and enforcement of shared processor resources. Matt's CMT patch takes care of cache monitoring:
https://lkml.kernel.org/r/1422038748-21397-1-git-send-email-matt@xxxxxxxxxxxxxxxxxxxx
The CAT patches support the cache enforcement part.

Cache allocation, or enforcement, takes place when applications fill the L3 cache. Enforcement does not come into play if the data is already in the cache and the application just reads it.
Enforcement is done through an MSR interface. One MSR contains the ID (IA32_PQR_MSR) and another holds a cache bitmask. The bitmask represents one or more L3 cache ways.
The ID and the bitmask have a 1:1 mapping. When a context switch takes place, the scheduler just does an MSR write with the ID.
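
To make the fill-versus-read distinction concrete, here is a toy userspace model of the mechanism described above. It is only a sketch, not kernel code and not taken from the patches: NUM_WAYS, NUM_IDS, the class and method names, the round-robin replacement policy, and the rule that a bitmask must be non-zero are all assumptions made for illustration.

```python
# Toy userspace model of ID -> bitmask enforcement; not kernel code.
# NUM_WAYS, NUM_IDS and every class/method name here are hypothetical.

NUM_WAYS = 8   # assumed number of L3 ways
NUM_IDS = 4    # assumed number of hardware IDs


class CacheSet:
    """One cache set: each way holds a tag, or None when empty."""

    def __init__(self):
        self.ways = [None] * NUM_WAYS
        self.next_victim = 0  # round-robin replacement, chosen for simplicity

    def access(self, tag, bitmask):
        """Return 'hit' or 'fill'. The bitmask is consulted only on a fill."""
        if tag in self.ways:
            return "hit"  # reading a resident line is never restricted
        allowed = [w for w in range(NUM_WAYS) if bitmask & (1 << w)]
        victim = allowed[self.next_victim % len(allowed)]
        self.next_victim += 1
        self.ways[victim] = tag
        return "fill"


class Cpu:
    """Holds the two pieces of state: the current ID and one bitmask per ID."""

    def __init__(self):
        self.masks = [(1 << NUM_WAYS) - 1] * NUM_IDS  # default: all ways
        self.current_id = 0  # stands in for the ID written to IA32_PQR_MSR
        self.cache = CacheSet()

    def set_mask(self, hw_id, bitmask):
        # Assumption: an empty or out-of-range bitmask is rejected.
        if bitmask == 0 or bitmask >> NUM_WAYS:
            raise ValueError("bitmask must select at least one valid way")
        self.masks[hw_id] = bitmask

    def context_switch(self, hw_id):
        # The only work on the switch path in this model: one write of the ID.
        self.current_id = hw_id

    def access(self, tag):
        return self.cache.access(tag, self.masks[self.current_id])
```

In this model, a task running with ID 1 and bitmask 0b00000011 can only ever place new lines in ways 0 and 1, yet a task running with a different ID still hits on a line that ID 1 brought in.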

The cgroup just provides an interface for the user to do the cache enforcement. Hence the cgroup is only used to partition the L3 cache resource. However, the partitions can overlap, since the cache bitmasks can overlap. Eventually this cgroup can also be extended to support enforcement of more resources.
The cgroup interface has a 'cbm' file which represents a cgroup's bitmask. The tasks belonging to a cgroup get to fill the cache lines represented by the cgroup's bitmask. Since there are a limited number of IDs available in the hardware, an error is returned when the user runs out of them.
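
The ID limit and the overlapping partitions can be sketched the same way. Again this is a toy model under stated assumptions, not the interface from the patches: MAX_IDS, IdPool, CacheGroup, OutOfIdsError and overlap are hypothetical names, and giving every group its own ID is a simplification.

```python
# Toy model of the cgroup side; not the code from the patch series.
# MAX_IDS, IdPool, CacheGroup, OutOfIdsError and overlap are hypothetical.

MAX_IDS = 4  # assumed; the real limit depends on the hardware


class OutOfIdsError(Exception):
    """Raised when every hardware ID is already in use."""


class IdPool:
    """Hands out the limited set of hardware IDs."""

    def __init__(self, max_ids=MAX_IDS):
        self.free = list(range(max_ids))

    def alloc(self):
        if not self.free:
            raise OutOfIdsError("no free hardware IDs")
        return self.free.pop(0)

    def release(self, hw_id):
        self.free.append(hw_id)


class CacheGroup:
    """Stands in for one cgroup: an ID plus the bitmask from its 'cbm' file."""

    def __init__(self, pool, cbm):
        self.hw_id = pool.alloc()  # simplification: one ID per group
        self.cbm = cbm
        self.tasks = set()


def overlap(a, b):
    """Ways that both groups are allowed to fill (0 means disjoint)."""
    return a.cbm & b.cbm
```

With MAX_IDS set to 4, creating a fifth group fails with OutOfIdsError until an existing ID is released, and two groups with cbm values 0b1111 and 0b0110 share ways 1 and 2.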


Thanks,
Vikas

Technology: Intel Resource Director Technology

Description: Allows the hypervisor to monitor Last Level Cache usage at the application
and VM levels.

Benefit: Helps to improve performance and efficiency by providing better
information for scheduling, load balancing, and workload migration

Which isn't any help in evaluating this patch series :-(

No, but it already tells us more than the 0/7 description
of the patch series did! It should be possible to improve
on that.

Maintainers reverse engineering the implementation is an
inefficient approach.

Thanks,

Ingo
