Re: [RFC 1/1] shiftfs: uid/gid shifting bind mount

From: James Bottomley
Date: Thu Feb 16 2017 - 10:52:13 EST


On Wed, 2017-02-15 at 09:17 -0500, Vivek Goyal wrote:
> On Tue, Feb 14, 2017 at 03:45:55PM -0800, James Bottomley wrote:
> > On Tue, 2017-02-14 at 18:03 -0500, Vivek Goyal wrote:

[...]
> > > Given we have already shifted the uid/gid for the shiftfs inode, I
> > > am wondering why we can't simply call
> > > generic_permission(shiftfs_inode, mask) directly in the context
> > > of the caller. Something like:
> > >
> > > shiftfs_permission() {
> > >         err = generic_permission(inode, mask);
> > >         if (err)
> > >                 return err;
> > >
> > >         switch_to_mounter_creds;
> > >         err = inode_permission(reali, mask);
> > >         revert_creds();
> > >
> > >         return err;
> > > }
> >
> > Because if reali->i_op->permission exists, you should use it.
> > You could argue shiftfs_permission should be
> >
> > if (iop->permission) {
> >         oldcred = shiftfs_new_creds(&newcred, inode->i_sb);
> >         err = iop->permission(reali, mask);
> >         shiftfs_old_creds(oldcred, &newcred);
> > } else
> >         err = generic_permission(inode, mask);
> >
> > But really that's a small optimisation.
>
> ok. I thought using the mounter's creds for the real inode checks
> would probably do away with the need to modify the caller's user
> namespace in shiftfs_get_up_creds().

Well, no ... the mounter of a marked superblock is container admin, but
the owner in the filesystem view is real root. The unprivileged
mounter's credentials therefore aren't sufficient.

> cred->fsuid = KUIDT_INIT(from_kuid(sb->s_user_ns, cred->fsuid));
> cred->fsgid = KGIDT_INIT(from_kgid(sb->s_user_ns, cred->fsgid));
> cred->user_ns = ssi->userns;
>
> IIUC, we are shifting the caller's fsuid and fsgid into the caller's
> user namespace but at the same time using reali->i_sb->s_user_ns as
> the user_ns. Feels a little twisted to me. Maybe I am
> misunderstanding it.

Actually what we're doing is shifting the credentials into the
underlying mount's filesystem view.
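
As a rough userspace illustration of that shift (a sketch only, not the
shiftfs code): struct uid_extent, ns_from_kuid() and shift_fsuid() are
made-up names, and the single-extent map is an assumption standing in
for the user namespace's uid mapping. The id is first translated to what
the namespace sees, and that number is then reused as the id for the
underlying filesystem, which is the effect of the
KUIDT_INIT(from_kuid(...)) lines quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one extent of a user namespace uid map. */
struct uid_extent {
        uint32_t ns_first;      /* first id as seen inside the namespace */
        uint32_t kernel_first;  /* first kernel-side id backing it */
        uint32_t count;         /* number of ids in the extent */
};

#define INVALID_ID ((uint32_t)-1)

/* Userspace analogue of from_kuid(): kernel id -> id inside the namespace. */
uint32_t ns_from_kuid(const struct uid_extent *map, uint32_t kuid)
{
        assert(map != NULL);
        if (kuid < map->kernel_first || kuid - map->kernel_first >= map->count)
                return INVALID_ID;      /* unmapped in this namespace */
        return map->ns_first + (kuid - map->kernel_first);
}

/*
 * Models the shift: the namespace-relative id is reinterpreted as the
 * id used against the underlying filesystem.
 */
uint32_t shift_fsuid(const struct uid_extent *map, uint32_t caller_fsuid)
{
        return ns_from_kuid(map, caller_fsuid);
}
```

With an assumed map of { 0, 100000, 65536 }, a caller whose kernel-side
fsuid is 100000 (container root) is presented to the underlying
filesystem as id 0, and an id outside the extent comes back as
INVALID_ID.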

> Two levels of checks will simplify this a bit. The top-level inode
> will belong to the user namespace of the caller and checks should
> pass. And the mounter's creds will have ownership over the real inode,
> so no additional namespace shifting is required there.

That's the problem: for a marked mount, they don't.
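
A toy model of why, under assumed numbers: struct toy_inode and
toy_permission() are hypothetical, the check only looks at owner and
other mode bits (no groups, ACLs or capabilities), and 100000 is an
assumed kernel-side id for the container admin who did the mount. The
real inode is owned by real root (id 0), so the mounter's unshifted id
fails the owner check, while a shifted id of 0 passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified inode: owner id plus rwx mode bits. */
struct toy_inode {
        uint32_t kuid;          /* owner as stored by the underlying fs */
        unsigned int mode;      /* e.g. 0700 */
};

/*
 * Simplified owner/other permission check. mask uses the low three
 * bits (4 = read, 2 = write, 1 = execute).
 */
bool toy_permission(uint32_t fsuid, const struct toy_inode *inode,
                    unsigned int mask)
{
        unsigned int bits;

        assert(mask != 0 && (mask & ~7u) == 0);
        /* Owner gets the top triplet, everyone else the bottom one. */
        bits = (fsuid == inode->kuid) ? (inode->mode >> 6) : inode->mode;
        return ((bits & 7u) & mask) == mask;
}
```

For a real inode of { 0, 0700 }, toy_permission(100000, &reali, 4) is
false and toy_permission(0, &reali, 4) is true.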

> We could also save these creds once at mount time and not have to
> prepare them for every call (not sure whether that is a significant
> performance issue or not). Just thinking aloud...

If it proves to be an issue, I suppose the shift could be cached, but I
really don't think it matters that much.

James