Re: [RFC][PATCHSET] sanitized pathwalk machinery (v2)

From: Al Viro
Date: Mon Feb 24 2020 - 21:03:15 EST


On Tue, Feb 25, 2020 at 01:24:57AM +0000, Al Viro wrote:

> Incidentally, another inconsistency is LOOKUP_BENEATH treatment in case
> when we have walked out of the subtree by way of e.g. procfs symlink and
> then ran into .. in the absolute root (that's
> 	if (!follow_up(&nd->path))
> 		break;
> in follow_dotdot()). Shouldn't that give the same reaction as ..
> in root (EXDEV on LOOKUP_BENEATH, that is)? It doesn't...
>
> Another one is about LOOKUP_NO_XDEV again: suppose you have process'
> root directly overmounted and cwd in the root of whatever's overmounting
> it. Resolution of .. will stay in cwd - we have no parent within the
> chroot jail we are in, so we move to whatever's overmounting that root.
> Which is the original location. Should we fail on LOOKUP_NO_XDEV here?
> Plain .. in the root of chroot jail (not overmounted by anything) does
> *not*...

FWIW, my preference would be the following (for non-RCU case; RCU
one is similar)

get_parent(nd)
{
	if (path_equal(&nd->path, &nd->root))
		return NULL;
	if (nd->path.dentry != nd->path.mnt->mnt_root)
		return dget_parent(nd->path.dentry);
	m = real_mount(nd->path.mnt);
	read_seqlock_excl(&mount_lock);
	while (mnt_has_parent(m)) {
		d = m->mnt_mountpoint;
		m = m->mnt_parent;
		if (&m->mnt == nd->root.mnt && d == nd->root.dentry) // root
			break;
		if (m->mnt_root != d) {
			if (unlikely(nd->flags & LOOKUP_NO_XDEV)) {
				read_sequnlock_excl(&mount_lock);
				return ERR_PTR(-EXDEV);
			}
			mntget(&m->mnt);
			dget(d);
			read_sequnlock_excl(&mount_lock);
			path_put(&nd->path);
			nd->path.mnt = &m->mnt;
			nd->path.dentry = d;
			nd->inode = d->d_inode;
			return dget_parent(d);
		}
	}
	read_sequnlock_excl(&mount_lock);
	return NULL;
}

with follow_dotdot() doing
	parent = get_parent(nd);
	if (unlikely(IS_ERR(parent)))
		return PTR_ERR(parent);
	if (unlikely(!parent)) {	// .. in root is a rare case
		if (unlikely(nd->flags & LOOKUP_BENEATH))
			return -EXDEV;
		parent = dget(nd->path.dentry);
	} else if (unlikely(!path_connected(nd->path.mnt, parent))) {
		dput(parent);
		return -ENOENT;
	}
	dput(nd->path.dentry);
	nd->path.dentry = parent;
	follow_mount(&nd->path);

... with the last part replaced with
	step_into(nd, WALK_NOFOLLOW, dentry, NULL, 0);
later in this series, and similarly in the RCU case (only there we would
want inode and seq supplied, as usual, so it would be
get_parent_rcu(nd, &inode, &seq)).
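
For anyone who wants to poke at the cases above without a kernel tree, here
is a rough userspace model of the get_parent() logic. It is only a sketch
under loud assumptions: the toy_* types, TOY_NO_XDEV and
toy_overmounted_root() are made up for illustration and are not kernel
interfaces; there is no refcounting, no mount_lock and no RCU; and a return
value of 1 / 0 / -EXDEV stands in for a dentry / NULL / ERR_PTR(-EXDEV).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_NO_XDEV 1	/* hypothetical stand-in for LOOKUP_NO_XDEV */

struct toy_dentry { struct toy_dentry *parent; };	/* fs root is its own parent */
struct toy_mount {
	struct toy_mount *parent;	/* NULL: not mounted on anything */
	struct toy_dentry *mountpoint;	/* dentry in ->parent that we cover */
	struct toy_dentry *root;	/* root dentry of this mount */
};
struct toy_path { struct toy_mount *mnt; struct toy_dentry *dentry; };
struct toy_nd { struct toy_path path, root; unsigned flags; };

/* 1: parent stored in *res; 0: nowhere to go (.. in root); -EXDEV: refused */
static int toy_get_parent(struct toy_nd *nd, struct toy_dentry **res)
{
	struct toy_mount *m;
	struct toy_dentry *d;

	if (nd->path.mnt == nd->root.mnt && nd->path.dentry == nd->root.dentry)
		return 0;
	if (nd->path.dentry != nd->path.mnt->root) {
		*res = nd->path.dentry->parent;	/* plain .. within one mount */
		return 1;
	}
	m = nd->path.mnt;
	while (m->parent) {
		d = m->mountpoint;
		m = m->parent;
		if (m == nd->root.mnt && d == nd->root.dentry)	/* root */
			break;
		if (m->root != d) {
			if (nd->flags & TOY_NO_XDEV)
				return -EXDEV;
			nd->path.mnt = m;	/* step down to the mountpoint */
			nd->path.dentry = d;
			*res = d->parent;
			return 1;
		}
	}
	return 0;
}

/* cwd is the root of a mount covering the process' root (the quoted case) */
static int toy_overmounted_root(unsigned flags)
{
	struct toy_dentry r0 = { &r0 }, r1 = { &r1 }, *p = NULL;
	struct toy_mount m0 = { NULL, NULL, &r0 }, m1 = { &m0, &r0, &r1 };
	struct toy_nd nd = { { &m1, &r1 }, { &m0, &r0 }, flags };
	int ret = toy_get_parent(&nd, &p);

	assert(nd.path.mnt == &m1 && nd.path.dentry == &r1);	/* stayed in cwd */
	return ret;
}
```

In this model toy_overmounted_root() yields 0 with or without TOY_NO_XDEV,
since the root check in the loop comes before the NO_XDEV one. That matches
plain .. in a chroot root that is not overmounted, at least as far as the
toy goes. Whether the real thing should behave that way is the question
raised in the quoted mail.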