Re: [PATCH v3 3/6] expose fine-grained per-cpu data for cpuacct stats

From: Glauber Costa
Date: Wed May 30 2012 - 08:22:28 EST

Next message: Konstantin Khlebnikov: "Re: 3.4-rc7: BUG: Bad rss-counter state mm:ffff88040b56f800 idx:1 val:-59"
Previous message: Frederic Weisbecker: "Re: [PATCH v3 16/28] memcg: kmem controller charge/uncharge infrastructure"
In reply to: Paul Turner: "Re: [PATCH v3 3/6] expose fine-grained per-cpu data for cpuacct stats"
Next in thread: Paul Turner: "Re: [PATCH v3 3/6] expose fine-grained per-cpu data for cpuacct stats"

On 05/30/2012 03:24 PM, Paul Turner wrote:

> +static int cpuacct_stats_percpu_show(struct cgroup *cgrp, struct cftype *cft,
> +				      struct cgroup_map_cb *cb)
> +{
> +	struct cpuacct *ca = cgroup_ca(cgrp);
> +	int cpu;
> +
> +	for_each_online_cpu(cpu) {
> +		do_fill_cb(cb, ca, "user", cpu, CPUTIME_USER);
> +		do_fill_cb(cb, ca, "nice", cpu, CPUTIME_NICE);
> +		do_fill_cb(cb, ca, "system", cpu, CPUTIME_SYSTEM);
> +		do_fill_cb(cb, ca, "irq", cpu, CPUTIME_IRQ);
> +		do_fill_cb(cb, ca, "softirq", cpu, CPUTIME_SOFTIRQ);
> +		do_fill_cb(cb, ca, "guest", cpu, CPUTIME_GUEST);
> +		do_fill_cb(cb, ca, "guest_nice", cpu, CPUTIME_GUEST_NICE);
> +	}
> +

> I don't know if there's much that can be trivially done about it but I
> suspect these are a bit of a memory allocation time-bomb on a many-CPU
> machine. The cgroup:seq_file mating (via read_map) treats everything
> as /one/ record. This means that seq_printf is going to end up
> eventually allocating a buffer that can fit _everything_ (as well as
> every power-of-2 on the way there). Adding insult to injury is
> that the backing buffer is kmalloc() not vmalloc().
>
> 200+ bytes per-cpu above really is not unreasonable (46 bytes just for
> the text, plus a byte per base 10 digit we end up reporting), but that
> then leaves us looking at order-12/13 allocations just to print this
> thing when there are O(many) cpus.

And how is /proc/stat different?
It will suffer from the very same problem, since it also has this very
same information per-cpu (actually more, since I am skipping some).

Now, if you guys are okay with a file per-cpu, I can do it as well.
It pollutes the filesystem, but at least protects against the fact that this is kmalloc-backed.
