[RFC][PATCH 0/2] cgroup: Fix some races against css_set task links
From: Frederic Weisbecker
Date:  Tue Feb 07 2012 - 21:37:37 EST
Hi,
This is trying to fix some races in the way to link all the tasks to
the css_set links. I hope some people can have a look at this, especially
as I'm definetly not an SMP ordering expert.
To give you the big picture, as long as nobody never calls
cgroup_iter_start(), we don't link the tasks to their css_set (this 'link'
is a simple list_add()).
But once somebody calls cgroup_iter_start(), we call
cgroup_enable_task_cg_lists() that grossly does this:
cgroup_enable_task_cg_lists() {
	use_task_set_css_links = 1;
	do_each_thread(g, p) {
		link p to css_set
	} while_each_thread(g, p);
}
But this links only existing tasks, we also need to link all the tasks
that will be created later, this is what does cgroup_post_fork():
cgroup_post_fork() {
	if (use_task_set_css_links)
		link p to css_set
}
So we have some races here:
- cgroup_enable_task_cg_lists() iterates over the tasklist
  without protection. The comments are advertizing we are using RCU
  but we don't. And in fact RCU doesn't yet protect against
  while_each_thread().
- Moreover with RCU there is a risk that we iterate the tasklist but
  we don't immediately see all the last updates that happened. For
  example if a task forks and passes cgroup_post_fork() while
  use_task_set_css_links = 0 then another CPU calling
  cgroup_enable_task_cg_list() can miss the new child while walking the
  tasklist with RCU as it doesn't appear immediately.
- There is no ordering constraint on use_task_set_css_links read/write
  against the tasklist traversal and modification. cgroup_post_fork()
  may deal with a stale value.
The second patch of the series is a proposal to fix the three above
points. Tell me what you think.
Thanks.
Frederic Weisbecker (2):
  cgroup: Remove wrong comment on cgroup_enable_task_cg_list()
  cgroup: Walk task list under tasklist_lock in
    cgroup_enable_task_cg_list
 kernel/cgroup.c |   23 ++++++++++++++++++++---
 1 files changed, 20 insertions(+), 3 deletions(-)
-- 
1.7.5.4
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/