Re: [PATCH -tip] cpuacct: Make cpuacct hierarchy walk in cpuacct_charge() safe when rcupreempt is used.

From: Bharata B Rao
Date: Tue Mar 17 2009 - 23:18:27 EST


On Tue, Mar 17, 2009 at 06:42:51PM +0530, Balbir Singh wrote:
> * Bharata B Rao <bharata@xxxxxxxxxxxxxxxxxx> [2009-03-17 13:06:49]:
>
> > On Tue, Mar 17, 2009 at 02:28:11PM +0800, Li Zefan wrote:
> > > Bharata B Rao wrote:
> > > > cpuacct: Make cpuacct hierarchy walk in cpuacct_charge() safe when
> > > > rcupreempt is used.
> > > >
> > > > cpuacct_charge() obtains task's ca and does a hierarchy walk upwards.
> > > > This can race with the task's movement between cgroups. This race
> > > > can cause an access to freed ca pointer in cpuacct_charge(). This will not
> > >
> > > Actually it can also end up accessing an invalid tsk->cgroups. ;)
> > >
> > > get tsk->cgroups (cg)
> > > (move tsk to another cgroup) or (tsk exiting)
> > > -> kfree(tsk->cgroups)
> > > get cg->subsys[..]
> >
> > Ok :) Here is the patch again with updated description.
> >
> > cpuacct: Make cpuacct hierarchy walk in cpuacct_charge() safe when
> > rcupreempt is used.
> >
> > cpuacct_charge() obtains task's ca and does a hierarchy walk upwards.
> > This can race with the task's movement between cgroups. This race
> > can cause an access to freed ca pointer in cpuacct_charge() or access
> > to invalid cgroups pointer of the task. This will not happen with rcu or
> > tree rcu as cpuacct_charge() is called with preemption disabled. However if
> > rcupreempt is used, the race is seen. Thanks to Li Zefan for explaining this.
> >
> > Fix this race by explicitly protecting ca and the hierarchy walk with
> > rcu_read_lock().
> >
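
To make the shape of the fix concrete, here is a rough user-space sketch of
the idea, not the actual patch. The struct names, cpuacct_charge_model() and
the rcu_depth counter are made up for illustration, and rcu_read_lock() /
rcu_read_unlock() are stubbed so the snippet compiles outside the kernel.
The point is only that both the lookup of the task's ca and the walk up the
parent chain sit inside one read-side critical section.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-ins for the kernel primitives. Here they only track nesting depth
 * so the sketch can run in user space; in the kernel they delimit an RCU
 * read-side critical section.
 */
static int rcu_depth;
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* Hypothetical, simplified accounting group: a usage counter and a parent. */
struct cpuacct_model {
    uint64_t usage;
    struct cpuacct_model *parent;
};

/* Hypothetical task holding a pointer to its current accounting group. */
struct task_model {
    struct cpuacct_model *ca;
};

/*
 * Charge cputime to the task's group and to every ancestor. The group
 * pointer is fetched and the upward walk is done entirely between
 * rcu_read_lock() and rcu_read_unlock().
 */
static void cpuacct_charge_model(struct task_model *tsk, uint64_t cputime)
{
    struct cpuacct_model *ca;

    rcu_read_lock();
    for (ca = tsk->ca; ca != NULL; ca = ca->parent)
        ca->usage += cputime;
    rcu_read_unlock();
}
```

In the real code the freeing side would also have to wait for a grace period
before releasing the old structures; that part is not modelled here.
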
>
> Looks good and works very well (except for the batch issue that you
> pointed out: it takes up to batch values before updates are seen).
>
> I'd like to get the patches in -tip and see the results. I would
> recommend using percpu_counter_sum() while reading the data as an
> enhancement to this patch. If user space does not overwhelm us with a
> lot of reads, sum would work out better.
>
>
> Tested-by: Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx>
> Acked-by: Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx>
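
For anyone following the batch remark above, a small user-space model of a
batched per-CPU counter may help. This is only a sketch under assumptions:
the model_* names, the CPU count and the batch size are invented here, and
it is not the kernel's percpu_counter implementation. It shows why a cheap
read can lag by up to one batch per CPU, while a sum over the per-CPU deltas
gives the exact value.

```c
#include <stdint.h>

#define NR_CPUS_MODEL 4   /* hypothetical CPU count for the sketch */
#define BATCH_MODEL   32  /* hypothetical batch size */

struct pcpu_counter_model {
    int64_t count;                 /* global value, possibly stale */
    int32_t local[NR_CPUS_MODEL];  /* per-CPU deltas not yet folded in */
};

/*
 * Add to one CPU's local delta and fold it into the global count only
 * once the delta reaches the batch size.
 */
static void model_add(struct pcpu_counter_model *c, int cpu, int32_t amount)
{
    int32_t v = c->local[cpu] + amount;

    if (v >= BATCH_MODEL || v <= -BATCH_MODEL) {
        c->count += v;
        c->local[cpu] = 0;
    } else {
        c->local[cpu] = v;
    }
}

/* Cheap read: global count only, so unfolded deltas are not visible. */
static int64_t model_read(const struct pcpu_counter_model *c)
{
    return c->count;
}

/* Exact read: global count plus every per-CPU delta. */
static int64_t model_sum(const struct pcpu_counter_model *c)
{
    int64_t sum = c->count;

    for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
        sum += c->local[cpu];
    return sum;
}
```

With 10 increments on one CPU and 20 on another, model_read() still reports
0 while model_sum() reports 30. The trade-off is that the exact sum touches
every CPU's slot on each read.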

So I guess this ack is not for this patch but for the per-cgroup
stime/utime cpuacct controller statistics patch.

Regards,
Bharata.