Re: [PATCHv2 5/7] cgroup: introduce cgroup namespaces

From: Aditya Kali
Date: Mon Nov 03 2014 - 18:42:58 EST


On Fri, Oct 31, 2014 at 5:58 PM, Eric W. Biederman
<ebiederm@xxxxxxxxxxxx> wrote:
> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>
>> On Fri, Oct 31, 2014 at 12:18 PM, Aditya Kali <adityakali@xxxxxxxxxx> wrote:
>
> <snip>
>
>>> +static void *cgroupns_get(struct task_struct *task)
>>> +{
>>> + struct cgroup_namespace *ns = NULL;
>>> + struct nsproxy *nsproxy;
>>> +
>>> + rcu_read_lock();
>>> + nsproxy = task->nsproxy;
>>> + if (nsproxy) {
>>> + ns = nsproxy->cgroup_ns;
>>> + get_cgroup_ns(ns);
>>> + }
>>> + rcu_read_unlock();
>>
>> How is this correct? Other namespaces do it too, so it Must Be
>> Correct (tm), but I don't understand. What is RCU protecting?
>
> The code is not correct. The code needs to use task_lock.
>
> RCU used to protect nsproxy, and now task_lock protects nsproxy.
> For the reasons behind all of this I refer you to the commit
> that changed this, and the comment in nsproxy.h
>

My bad. This should be under task_lock. I will fix it.
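
A rough userspace sketch of the locking change discussed above, not the
actual patch: the rcu_read_lock()/rcu_read_unlock() pair in the quoted
cgroupns_get() is replaced by task_lock()/task_unlock(). Everything except
the body of cgroupns_get() is a stand-in assumed for illustration: the
pthread mutex models the task's lock, the plain int models the namespace
reference count, and the final "return ns" is assumed because the quoted
hunk was snipped before the end of the function.

```c
/* Userspace model only; the struct layouts here are NOT the kernel's. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct cgroup_namespace {
	int count;			/* stand-in for the kernel refcount */
};

struct nsproxy {
	struct cgroup_namespace *cgroup_ns;
};

struct task_struct {
	pthread_mutex_t alloc_lock;	/* models the lock behind task_lock() */
	struct nsproxy *nsproxy;	/* NULL models a task with no nsproxy */
};

/* Model of task_lock(): serialises readers and writers of task->nsproxy. */
static void task_lock(struct task_struct *task)
{
	pthread_mutex_lock(&task->alloc_lock);
}

static void task_unlock(struct task_struct *task)
{
	pthread_mutex_unlock(&task->alloc_lock);
}

/* Model of get_cgroup_ns(): take one more reference on a live namespace. */
static void get_cgroup_ns(struct cgroup_namespace *ns)
{
	assert(ns->count > 0);
	ns->count++;
}

static void *cgroupns_get(struct task_struct *task)
{
	struct cgroup_namespace *ns = NULL;
	struct nsproxy *nsproxy;

	/* task_lock, not RCU, is what keeps task->nsproxy stable here. */
	task_lock(task);
	nsproxy = task->nsproxy;
	if (nsproxy) {
		ns = nsproxy->cgroup_ns;
		get_cgroup_ns(ns);
	}
	task_unlock(task);

	return ns;	/* assumed; the quoted hunk ends before this point */
}
```

The reference is taken while the lock is still held, so a concurrent
switch of the task's nsproxy cannot drop the namespace between the
pointer read and the get.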

> commit 728dba3a39c66b3d8ac889ddbe38b5b1c264aec3
> Author: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
> Date: Mon Feb 3 19:13:49 2014 -0800
>
> namespaces: Use task_lock and not rcu to protect nsproxy
>
> The synchronous synchronize_rcu in switch_task_namespaces makes setns
> a sufficiently expensive system call that people have complained.
>
> Upon inspection nsproxy no longer needs rcu protection for remote reads.
> Remote reads are rare. So optimize for same process reads and writes
> by switching to using task_lock instead.
>
> This yields a simpler to understand lock, and a faster setns system call.
>
> In particular this fixes a performance regression observed
> by Rafael David Tinoco <rafael.tinoco@xxxxxxxxxxxxx>.
>
> This is effectively a revert of Pavel Emelyanov's commit
> cf7b708c8d1d7a27736771bcf4c457b332b0f818 Make access to task's nsproxy lighter
> from 2007. The race this originally fixed no longer exists as
> do_notify_parent uses task_active_pid_ns(parent) instead of
> parent->nsproxy.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
>
> Eric



--
Aditya