Re: [RFC PATCH -tip] cpuacct: per-cgroup utime/stime statistics -v2

From: Bharata B Rao
Date: Tue Mar 17 2009 - 02:23:32 EST


On Mon, Mar 16, 2009 at 04:58:45PM +0800, Li Zefan wrote:
> Bharata B Rao wrote:
> > On Mon, Mar 16, 2009 at 03:13:38PM +0800, Li Zefan wrote:
> >>>>> +/*
> >>>>> + * Account the system/user time to the task's accounting group.
> >>>>> + */
> >>>>> +static void cpuacct_update_stats(struct task_struct *tsk,
> >>>>> + enum cpuacct_stat_index idx, cputime_t val)
> >>>>> +{
> >>>>> + struct cpuacct *ca;
> >>>>> +
> >>>>> + if (unlikely(!cpuacct_subsys.active))
> >>>>> + return;
> >>>>> +
> >>>>> + ca = task_ca(tsk);
> >>>>> +
> >>>>> + do {
> >>>>> + percpu_counter_add(&ca->cpustat[idx], val);
> >>>>> + ca = ca->parent;
> >>>>> + } while (ca);
> >>>>> +}
> >>>>> +
> >>>> IIUC, to make sure accessing "ca" to be safe, we need some condition.
> >>>> (task_lock() or some other.....
> >>> task_lock() protects tsk->cgroups->subsys[]. So can we hold task_lock()
> >>> to protect this walk ? But we do this cpuacct hierarchy walk for the
> >>> current task here. So can a current task's ca or ca's parents disappear
> >>> from under us ?
> >>>
> >> task_ca() should be protected by task_lock() or rcu_read_lock(), otherwise
> >> there is a very small race:
> >>
> >> ca = task_ca(tsk)
> >> move @tsk to another cgroup
> >> rmdir old_cgrp (thus ca is freed)
> >> ca->cpustat <--- accessing freed memory
> >>
> >> As KAMEZAWA-san said all updates are called under preempt-disabled, and
> >> classic and tree rcu's rcu_read_lock does preempt disable only, so above
> >> code is ok, except for rcupreempt.
> >
> > So I will protect task_ca() and ca hierarchy walk with explicit
> > rcu_read_lock() to be fully safe.
> >
>
> either:
>
> rcu_read_lock();
> ca = task_ca(tsk);
> do {
> percpu_counter_add(&ca->cpustat[idx], val);
> ca = ca->parent;
> } while (ca);
> rcu_read_unlock();
>
> or:
>
> rcu_read_lock();
> ca = task_ca(tsk);
> css_get(&ca->css);
> rcu_read_unlock();
> do {
> percpu_counter_add(&ca->cpustat[idx], val);
> ca = ca->parent;
> } while (ca);
> css_put(&ca->css);
>
> which is more efficient?

Went with the first one in my next version of cpuacct stats patch
as I felt it is better than the 2nd version which needs 2 atomic
operations.

>
> > By the same logic, hierarchy walk in cpuacct_charge() is also
> > not safe with rcupreempt. It is under preempt disabled section due
> > to rq->lock. Does cpuacct_charge() also need a fix then ?
> >
>
> I guess so..

Sent a separate fix for this.

Thanks for your review and suggestions.

Regards,
Bharata.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/