Re: [RFC][PATCH] cgroup: fix race between fork and cgroup freezing

From: Li Zefan
Date: Mon Mar 12 2012 - 04:59:03 EST


Tejun Heo wrote:
> Hello, Li.
>
> On Fri, Mar 09, 2012 at 02:26:05PM +0800, Li Zefan wrote:
>> The problem is, forks can happen at any time, so there's no way to prevent
>> forks from happening while iterating tasks in a cgroup, so controllers
>> have to deal with it. In fact freezer is somewhat aware of this issue,
>> that's why it provides the ->fork callback, but there's a race.
>>
>> This patch is not too bad (needs a bit of modification). cgroup core will detect
>> (via seqcount) if something's happened to a cgroup and the tasks in it, and
>> then cgroup will notify controllers to check if newly-forked tasks should
>> be adjusted accordingly, so they will have consistent status with other tasks
>> in the same cgroup.
>
> But why can't we just do what every sane subsystem would do - link
> first and then invoke notification callback? I mean, we're now
> essentially trying to do the following.
>
> 1. Take some action.
> 2. Trigger notification.
> 3. Link the result of the action to list.
>
> So, of course, if someone tries to traverse the "results", there's a
> race window between #2 and #3. Your fix seems to change the traverser
> to,
>
> 1. Traverse the list.
> 2. If something happened in between, take another look.
>
> But, the right thing to do would be changing the fork path to
>
> 1. Take some action.
> 2. Link the result of the action to list.
> 3. Trigger notification.
>

The reasons are:

- We still need some kind of locking to synchronize fork and the traverser.
The fork side is protected by tasklist_lock, while the traverser takes
css_set_lock.

- After linking the new task to css set list, the task is visible and thus
can be moved to another cgroup, which makes things more complicated and
the subsystem callbacks may have to acquire cgroup_mutex.

- The task_counter subsystem wants to get notified before the new task
is linked, so it's able to abort the fork.
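
To make the third point concrete, here is a rough single-threaded
userspace model of a fork path with two hooks: one that runs before the
task is linked and can veto the fork, and one that runs after linking to
bring the new task in line with the group state. Every name in it
(model_cgroup, model_can_fork, model_post_fork, model_fork, model_freeze)
is made up for illustration and is not a kernel interface. It also
deliberately ignores the first two points: there is no locking and no
task migration in this sketch.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MAX_TASKS 4

struct model_cgroup {
	int frozen;                 /* state every linked task should have */
	int nr_tasks;               /* number of tasks linked so far */
	int limit;                  /* task_counter-style cap, <= MAX_TASKS */
	int task_frozen[MAX_TASKS]; /* per-task frozen flag */
};

/* Pre-link hook: may veto the fork before the task becomes visible. */
static int model_can_fork(const struct model_cgroup *cg)
{
	return cg->nr_tasks < cg->limit ? 0 : -1;
}

/* Post-link hook: make the new task consistent with the group state. */
static void model_post_fork(struct model_cgroup *cg, int idx)
{
	cg->task_frozen[idx] = cg->frozen;
}

/* Fork path: veto check, then link, then notify. Returns index or -1. */
static int model_fork(struct model_cgroup *cg)
{
	int idx;

	if (model_can_fork(cg) != 0)
		return -1;              /* aborted; nothing was linked */

	assert(cg->nr_tasks < MAX_TASKS);
	idx = cg->nr_tasks++;       /* link: a traverser can now see the task */
	model_post_fork(cg, idx);
	return idx;
}

/* Traverser: mark the group frozen and freeze every linked task. */
static void model_freeze(struct model_cgroup *cg)
{
	int i;

	cg->frozen = 1;
	for (i = 0; i < cg->nr_tasks; i++)
		cg->task_frozen[i] = 1;
}
```

With limit set to 2, forking once, calling model_freeze(), and forking
again leaves both tasks frozen, and a third model_fork() is refused
without changing nr_tasks. A single post-link callback alone could not
give the second behavior without first linking and then unlinking the
task.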