Re: [PATCH review 52/85] sunrpc: Properly encode kuids and kgids in auth.unix.gid rpc pipe upcalls.

From: Eric W. Biederman
Date: Thu Feb 14 2013 - 03:42:34 EST


Stanislav Kinsbursky <skinsbursky@xxxxxxxxxxxxx> writes:

> 14.02.2013 03:22, Eric W. Biederman wrote:

> Hmmm...
> Maybe I'm missing the point of user namespaces, but since NFS kernel server
> is controlled via NFSd file system write calls, maybe it would be better to add:
>
> .fs_flags = FS_USERNS_MOUNT
>
> to it and add check:
>
> + if (net->user_ns != current_user_ns())
> + return -EINVAL;
>
> No?

Not really. The immediate goal is to just use kuid_t and kgid_t instead
of uid_t and gid_t throughout the kernel. That ensures someone hasn't
missed a case and is getting a uid in one namespace confused with a uid
in another. And that is needed to make it safe to enable nfs and nfsd
support when user namespace support is enabled in the kernel.

So at the basic level I have made the assumption that all nfs activity
happens in the initial user namespace and have made conversions to/from
the initial user namespace throughout the nfs and nfsd code.
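
To make the kuid_t idea concrete, here is a minimal userspace sketch. It
is not kernel code: every toy_* name is hypothetical, and the single
mapping extent per namespace is a simplifying assumption. It only shows
why a distinct struct type plus explicit conversions through a namespace
keeps a uid in one namespace from being confused with a uid in another.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of the toy_* names are kernel symbols. */
typedef uint32_t toy_uid_t;

/* Wrapping the value in a struct makes mixing raw and global ids a type error. */
typedef struct { toy_uid_t val; } toy_kuid_t;

#define TOY_INVALID_UID ((toy_uid_t)-1)

/* One mapping extent: ids [first, first + count) inside the namespace
 * correspond to [lower_first, lower_first + count) globally. */
struct toy_user_ns {
    toy_uid_t first;
    toy_uid_t lower_first;
    toy_uid_t count;
};

/* Identity mapping, standing in for the initial user namespace. */
const struct toy_user_ns toy_init_user_ns = { 0, 0, UINT32_MAX };

/* Convert a uid as seen inside ns into the global representation. */
toy_kuid_t toy_make_kuid(const struct toy_user_ns *ns, toy_uid_t uid)
{
    toy_kuid_t k = { TOY_INVALID_UID };

    assert(ns != NULL);
    if (uid >= ns->first && uid - ns->first < ns->count)
        k.val = ns->lower_first + (uid - ns->first);
    return k;
}

/* Convert a global id back into the uid that ns would see, if it is mapped. */
toy_uid_t toy_from_kuid(const struct toy_user_ns *ns, toy_kuid_t k)
{
    assert(ns != NULL);
    if (k.val >= ns->lower_first && k.val - ns->lower_first < ns->count)
        return ns->first + (k.val - ns->lower_first);
    return TOY_INVALID_UID;
}

/* An id that had no mapping must be rejected rather than used. */
int toy_uid_valid(toy_kuid_t k)
{
    return k.val != TOY_INVALID_UID;
}
```

In this model, code that assumes the initial user namespace always passes
&toy_init_user_ns to both conversions, so ids round-trip unchanged.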

We can add FS_USERNS_MOUNT when we are ready to support running in
multiple user namespaces. For now not allowing mounts outside of the
initial user namespace ensures that the nfs client code is always
in the initial user namespace and that the nfs server code is always
dealing with ids in the initial user namespace.

Stanislav, even with your pending patches it won't be possible to mount
nfsd where net->user_ns != &init_user_ns. So no bugs will result from the
combination of our patches. The one case I was worried about was
fs/nfs/exports. But since that is read-only it creates no problems.


The big thing user namespaces allow (besides uid and gid mapping) is a
context where unprivileged users can create containers. Those
containers can mount and unmount filesystems and have a root user. But
that root user does not have global uid == 0, nor does that root user
have any global capabilities. The root user only has capabilities over
objects created in that user namespace, which can include network
namespaces etc.
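
The scoping rule above can be illustrated with a small userspace model.
This is a sketch under stated assumptions, not the kernel's
implementation: the toy_* names are hypothetical, capabilities are
collapsed into a single flag, and the walk up the parent chain is only
loosely modelled on how a namespace-aware capability check behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; toy_* names are not kernel symbols. */
struct toy_user_ns {
    const struct toy_user_ns *parent; /* NULL for the initial namespace */
    unsigned int owner;               /* global uid that created this namespace */
};

struct toy_cred {
    const struct toy_user_ns *user_ns; /* namespace the task lives in */
    unsigned int global_uid;           /* the task's uid as seen globally */
    bool has_caps;                     /* holds capabilities in its own namespace */
};

/* Does cred hold capabilities over an object owned by namespace target? */
bool toy_ns_capable(const struct toy_cred *cred, const struct toy_user_ns *target)
{
    assert(cred != NULL);
    for (const struct toy_user_ns *ns = target; ns != NULL; ns = ns->parent) {
        /* Capabilities held in a namespace apply to it and its descendants. */
        if (ns == cred->user_ns)
            return cred->has_caps;
        /* The creator of a namespace is privileged over it from the parent. */
        if (ns->parent == cred->user_ns && ns->owner == cred->global_uid)
            return true;
    }
    /* target is not at or below the task's namespace: no capabilities. */
    return false;
}
```

With a container created by global uid 1000, the container's root is
capable over objects owned by the container's namespace but not over
objects owned by the initial namespace.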




Now all of that said and done, when we do start supporting user
namespaces in nfs (something that looks comparatively simple after your
recent work to make nfs and nfsd network namespace aware) I expect the
mount for nfsd and nfs will want to do:

        if (net->user_ns != current_user_ns())
                return -EINVAL;

I can't see any other cases actually making sense. If we are in an
ancestor user namespace of net->user_ns we are ok permission-wise,
but we are in a totally confused state with respect to which
permissions to talk about. If we are in a descendant user namespace we
should not have the permissions to do potentially dangerous things.
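
The three cases can be spelled out in a userspace sketch. The toy_*
types and functions are hypothetical stand-ins, not kernel code; the
model assumes each user namespace simply records its parent, and it
shows that the equality test rejects both the ancestor and the
descendant case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; toy_* names are not kernel symbols. */
struct toy_user_ns {
    const struct toy_user_ns *parent; /* NULL for the initial namespace */
};

struct toy_net {
    const struct toy_user_ns *user_ns; /* user namespace owning this netns */
};

enum toy_ns_relation {
    TOY_NS_SAME,
    TOY_NS_ANCESTOR,   /* caller is above net->user_ns */
    TOY_NS_DESCENDANT, /* caller is below net->user_ns */
    TOY_NS_UNRELATED
};

/* Is a a strict ancestor of b? */
int toy_is_ancestor(const struct toy_user_ns *a, const struct toy_user_ns *b)
{
    assert(b != NULL);
    for (const struct toy_user_ns *ns = b->parent; ns != NULL; ns = ns->parent)
        if (ns == a)
            return 1;
    return 0;
}

/* Where does the caller's namespace sit relative to the netns owner? */
enum toy_ns_relation toy_relation(const struct toy_user_ns *caller,
                                  const struct toy_user_ns *net_ns)
{
    if (caller == net_ns)
        return TOY_NS_SAME;
    if (toy_is_ancestor(caller, net_ns))
        return TOY_NS_ANCESTOR;
    if (toy_is_ancestor(net_ns, caller))
        return TOY_NS_DESCENDANT;
    return TOY_NS_UNRELATED;
}

/* The mount-time check: only an exact match is allowed. */
int toy_mount_check(const struct toy_net *net, const struct toy_user_ns *caller)
{
    assert(net != NULL);
    if (net->user_ns != caller)
        return -EINVAL;
    return 0;
}
```

Only TOY_NS_SAME passes toy_mount_check(); the relation helper exists
purely to name the cases the plain pointer comparison rules out.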

Which should make for a very simple conversion when we want to get nfs
running in multiple user namespaces for 3.10, as we can just replace
&init_user_ns with net->user_ns everywhere.

Eric