Re: Perf event operation with hotplug cpus and cgroups

From: William Cohen
Date: Fri Mar 20 2015 - 15:42:04 EST


On 03/20/2015 03:22 PM, Peter Zijlstra wrote:
> On Fri, Mar 20, 2015 at 03:10:39PM -0400, William Cohen wrote:
>> cgroup monitoring
>>
>> The cgroup monitoring is built on the perf event per cpu monitoring.
>> If the cgroup is not pinned to a particular set of processors, then
>> systemwide monitoring for that cgroup needs to be done and a perf
>> event open is needed for every cpu in the system.
>
>> The issue with this
>> approach is if the cgroups are used for virtual machine guests where
>> each cgroup is allocated a single processor, the number of cgroups is
>> proportional to the number of processors in the machine. The number
>> of files that need to be opened to monitor the cgroups on the system
>> is O(cpus^2).
>
> That's what you get for doing silly thing like that, isn't it. Why would
> you create a cgroup per vcpu and then measure that cgroup if you're
> interested in the whole virtual machine?

Hi Peter,

There isn't any desire to aggregate the data from the different cgroups. The desired grouping is measurements per cgroup, much like the pid scoping for perf, but for a cgroup. It is just that, because of the way perf event measurement works for cgroups, the measurements need to be taken system-wide.
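
For reference, here is a minimal sketch of the per-cpu setup that user space has to do for one cgroup. It assumes the cpus are numbered 0..ncpus-1 and are all online (which hotplug can break), and it uses a cycles counter purely for illustration. The helper names sys_perf_event_open(), fds_needed() and open_cgroup_counters() are made up for this example, and cgroup_path stands for a directory in the perf_event cgroup hierarchy:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc has no wrapper for perf_event_open(2), so go through syscall(2). */
static long sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
                                int cpu, int group_fd, unsigned long flags)
{
    return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Event descriptors needed to monitor ncgroups cgroups on ncpus cpus. */
static long fds_needed(long ncgroups, long ncpus)
{
    return ncgroups * ncpus;
}

/*
 * Open one counting event per cpu for the cgroup at cgroup_path.
 * fds[] must have room for ncpus entries; a failed open leaves a
 * negative value in its slot.
 * Returns the number of events opened, or -1 if the cgroup directory
 * itself cannot be opened.
 */
static int open_cgroup_counters(const char *cgroup_path, int ncpus, int *fds)
{
    struct perf_event_attr attr;
    int cgroup_fd, cpu, opened = 0;

    cgroup_fd = open(cgroup_path, O_RDONLY);
    if (cgroup_fd < 0)
        return -1;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.config = PERF_COUNT_HW_CPU_CYCLES; /* illustrative event choice */

    for (cpu = 0; cpu < ncpus; cpu++) {
        /*
         * In cgroup mode the pid argument carries the cgroup directory
         * fd, and cpu == -1 is rejected, hence one open per cpu.
         */
        fds[cpu] = (int)sys_perf_event_open(&attr, cgroup_fd, cpu, -1,
                                            PERF_FLAG_PID_CGROUP);
        if (fds[cpu] >= 0)
            opened++;
    }

    close(cgroup_fd); /* the opened events do not depend on this fd */
    return opened;
}
```

With one single-cpu cgroup per guest on a 64-cpu machine, fds_needed(64, 64) comes to 4096 descriptors, which is the O(cpus^2) growth described above.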

> Just measure the parent cgroup of the vcpu cgroups if you're really only
> interested in the virtual machine crap thing.
>
>> Given the issues with these use cases, is user-space setting up the
>> counters for each cpu in the system the best solution? Would it be
>> better to allow the system-wide data collection to be selected with
>> one perf event open with pid==-1 and cpu==-1? Is setup of per cpu
>> monitoring and aggregation of the counters across processors too
>> difficult to do in the kernel?
>
> Not hard at all, but useless, you need a fd per cpu in order to get your
> data out. Remember that the ring buffers are strictly per cpu.
>

Are the ring buffers needed just for sampling, or are they also needed for "perf stat" type information?

-Will