Re: [git pull] mount API series

From: Al Viro
Date: Mon Nov 12 2018 - 15:54:48 EST


On Sun, Nov 11, 2018 at 08:07:20PM -0600, Eric W. Biederman wrote:
> Steven Whitehouse <swhiteho@xxxxxxxxxx> writes:

> > Can you share some details of what this NULL dereference is? David and
> > Al have been working on the changes as requested by Linus later in
> > this thread, and they'd like to tidy up this issue too at the same
> > time if possible. We are not asking you to actually provide a fix, in
> > case you are too busy to do so, however it would be good to know what
> > the issue is so that we can make sure that it is resolved in the next
> > round of patches,
>
> https://lore.kernel.org/lkml/87bm7n5k1r.fsf@xxxxxxxxxxxx/

Thought it had been dealt with, but you are right - oops is still there
and obviously needs fixing. However, looking at that place in mainline...
How does that thing manage to work? Look:
/* Notice when we are propagating across user namespaces */
if (m->mnt_ns->user_ns != user_ns)
type |= CL_UNPRIVILEGED;
child = copy_tree(last_source, last_source->mnt.mnt_root, type);
if (IS_ERR(child))
return PTR_ERR(child);
child->mnt.mnt_flags &= ~MNT_LOCKED;
mnt_set_mountpoint(m, mp, child);
last_dest = m;
last_source = child;
OK, we'd created the copy to be attached to the next candidate mountpoint.
If we have e.g. a 4-element peer group, we'll start with what we'd been
asked to mount, then call that sucker three times, getting a copy for
each of those mountpoints. Right? Now, what happens if the 1st, 3rd and
4th members live in your namespace, with the second one being elsewhere?
We have
source_mnt: that'll go on top of the 1st mountpoint
copy of source_mnt: that'll go on top of the 2nd mountpoint
copy of copy of source_mnt: that'll go on top of the 3rd mountpoint
copy of copy of copy of source_mnt: that'll go on top of the 4th one
And AFAICS your logics there has just made sure that everything except the
source_mnt will have MNT_LOCK_... all over the place. IOW, the effect of
CL_UNPRIVELEGED is cumulative.

How the hell does that code avoid this randomness? Note had the members of
that peer group been in a different order, you would've gotten a different result.
What am I missing here?

Oops is a separate story, and a regression in its own right; it needs to be
fixed. But I would really like to sort out the semantics of the existing
code in that area, so that we don't end up with patch conflicts.