Re: [PATCH] cpuacct: don't let VIRT_CPU_ACCOUNTING prevent percpu cputime caching

From: Balbir Singh
Date: Tue Apr 28 2009 - 18:21:11 EST


* KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> [2009-04-28 15:53:32]:

>
> I'm not a cpuacct expert. Please comment.
>
> ====================
> Subject: [PATCH] cpuacct: don't let VIRT_CPU_ACCOUNTING prevent percpu cputime caching
>
> Impact: small performance improvement
>
> cpuacct_update_stats() is called on every tick and uses a percpu_counter
> to avoid performance degradation.
>
> Unfortunately, this doesn't work properly when VIRT_CPU_ACCOUNTING=y.
> In that case every tick adds well over 1000 cputime units, far more than
> the default percpu_counter batch, so every percpu_counter_add() ends up
> grabbing the spinlock and updating the shared (non-percpu) counter.
>
> This patch changes the batch value: each cpu may now cache up to
> jiffies_to_cputime(percpu_counter_batch) worth of cputime before folding
> it into the shared counter. Behaviour is unchanged when
> VIRT_CPU_ACCOUNTING=n, and the percpu caching now also works when
> VIRT_CPU_ACCOUNTING=y.
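
For reference, the percpu_counter batching this relies on looks roughly like
the sketch below (simplified from lib/percpu_counter.c, not the exact in-tree
code). With the default batch of percpu_counter_batch (32), any per-tick
amount larger than the batch always hits the locked slow path, which is
exactly the VIRT_CPU_ACCOUNTING=y case described above:

	/* simplified sketch of __percpu_counter_add() batching */
	void __percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch)
	{
		s64 count;
		s32 *pcount = per_cpu_ptr(fbc->counters, get_cpu());

		count = *pcount + amount;
		if (count >= batch || count <= -batch) {
			/* slow path: fold into the shared count under the lock */
			spin_lock(&fbc->lock);
			fbc->count += count;
			*pcount = 0;
			spin_unlock(&fbc->lock);
		} else {
			/* fast path: stays in the per-cpu cache, no lock taken */
			*pcount = count;
		}
		put_cpu();
	}

	/* percpu_counter_add() is the above with batch = percpu_counter_batch */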
>
>
> Cc: Bharata B Rao <bharata@xxxxxxxxxxxxxxxxxx>
> Cc: Balaji Rao <balajirrao@xxxxxxxxx>
> Cc: Dhaval Giani <dhaval@xxxxxxxxxxxxxxxxxx>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Cc: Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxx>
> Cc: Martin Schwidefsky <schwidefsky@xxxxxxxxxx>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
> ---
> kernel/sched.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> Index: b/kernel/sched.c
> ===================================================================
> --- a/kernel/sched.c 2009-04-28 14:18:36.000000000 +0900
> +++ b/kernel/sched.c 2009-04-28 15:18:07.000000000 +0900
> @@ -10117,6 +10117,7 @@ struct cpuacct {
> };
>
> struct cgroup_subsys cpuacct_subsys;
> +static s32 cpuacct_batch;
>
> /* return cpu accounting group corresponding to this container */
> static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
> @@ -10146,6 +10147,9 @@ static struct cgroup_subsys_state *cpuac
> if (!ca->cpuusage)
> goto out_free_ca;
>
> + if (!cpuacct_batch)
> + cpuacct_batch = jiffies_to_cputime(percpu_counter_batch);
> +

I expect cpuacct_batch to be a large number
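
For example (rough numbers, assuming HZ=1000 and cputime accounted at
microsecond granularity; both are assumptions and arch-dependent),
jiffies_to_cputime(percpu_counter_batch) with the default batch of 32 is
already about 32 * 1000 = 32,000 cputime units per cpu, and architectures
that account in finer units push it into the millions.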

> for (i = 0; i < CPUACCT_STAT_NSTATS; i++)
> if (percpu_counter_init(&ca->cpustat[i], 0))
> goto out_free_counters;
> @@ -10342,7 +10346,7 @@ static void cpuacct_update_stats(struct
> ca = task_ca(tsk);
>
> do {
> - percpu_counter_add(&ca->cpustat[idx], val);
> + __percpu_counter_add(&ca->cpustat[idx], val, cpuacct_batch);

This will push the reported value a long way off the real one, since each
cpu can now hold a large batch locally. If we go this route, we should
probably switch the read side to __percpu_counter_sum() so that the large
batch does not make the reported data wildly inaccurate.
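
Concretely, the read side would look something like the sketch below. This
is only an illustration: the function and field names approximate the
existing cpuacct reader (cpuacct_stats_show() and ca->cpustat[]) rather than
quoting it. The point is to replace percpu_counter_read() with
percpu_counter_sum(), which folds in every cpu's cached delta, so the large
per-cpu batches never show up as missing time:

	static int cpuacct_stats_show(struct cgroup *cgrp, struct cftype *cft,
				      struct cgroup_map_cb *cb)
	{
		struct cpuacct *ca = cgroup_ca(cgrp);
		int i;

		for (i = 0; i < CPUACCT_STAT_NSTATS; i++) {
			/* sum the shared count plus every cpu's cached delta */
			s64 val = percpu_counter_sum(&ca->cpustat[i]);

			cb->fill(cb, cpuacct_stat_desc[i], cputime64_to_clock_t(val));
		}
		return 0;
	}

percpu_counter_sum() takes the counter lock and walks all cpus, so each read
is more expensive, but reads of cpuacct.stat should be far rarer than the
per-tick updates.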

> ca = ca->parent;
> } while (ca);
> rcu_read_unlock();
>
>

--
Balbir