Re: [RFC PATCH] fs: add AT_EMPTY_PATH support to unlinkat()

From: Clay Harris
Date: Tue Oct 10 2023 - 01:47:54 EST


Apologies, this message was intended as a reply to Al Viro, but I accidentally
deleted that message so I'm replying to the reply instead.

On Mon, Oct 09 2023 at 16:14:27 +0100, Luis Henriques quoth thus:

> Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:
>
> > On Fri, Sep 29, 2023 at 03:04:56PM +0100, Luis Henriques wrote:
> >
> >> -int do_unlinkat(int dfd, struct filename *name)
> >> +int do_unlinkat(int dfd, struct filename *name, int flags)
> >> {
> >> int error;
> >> - struct dentry *dentry;
> >> + struct dentry *dentry, *parent;
> >> struct path path;
> >> struct qstr last;
> >> int type;
> >> struct inode *inode = NULL;
> >> struct inode *delegated_inode = NULL;
> >> unsigned int lookup_flags = 0;
> >> -retry:
> >> - error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type);
> >> - if (error)
> >> - goto exit1;
> >> + bool empty_path = (flags & AT_EMPTY_PATH);
> >>
> >> - error = -EISDIR;
> >> - if (type != LAST_NORM)
> >> - goto exit2;
> >> +retry:
> >> + if (empty_path) {
> >> + error = filename_lookup(dfd, name, 0, &path, NULL);
> >> + if (error)
> >> + goto exit1;
> >> + parent = path.dentry->d_parent;
> >> + dentry = path.dentry;
> >> + } else {
> >> + error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type);
> >> + if (error)
> >> + goto exit1;
> >> + error = -EISDIR;
> >> + if (type != LAST_NORM)
> >> + goto exit2;
> >> + parent = path.dentry;
> >> + }
> >>
> >> error = mnt_want_write(path.mnt);
> >> if (error)
> >> goto exit2;
> >> retry_deleg:
> >> - inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT);
> >> - dentry = lookup_one_qstr_excl(&last, path.dentry, lookup_flags);
> >> + inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> >> + if (!empty_path)
> >> + dentry = lookup_one_qstr_excl(&last, parent, lookup_flags);
> >
> > For starters, your 'parent' might have been freed under you, just as you'd
> > been trying to lock its inode. Or it could have become negative just as you'd
> > been fetching its ->d_inode, while we are at it.
> >
> > Races aside, you are changing permissions required for removing files. For
> > unlink() you need to be able to get to the parent directory; if it's e.g.
> > outside of your namespace, you can't do anything to it. If file had been
> > opened there by somebody who could reach it and passed to you (via SCM_RIGHTS,
> > for example) you currently can't remove the sucker. With this change that
> > is no longer true.
> >
> > The same goes for the situation when file is bound into your namespace (or
> > chroot jail, for that matter). path.dentry might very well be equal to
> > root of path.mnt; path.dentry->d_parent might be in part of tree that is
> > no longer visible *anywhere*. rmdir() should not be able to do anything
> > with it...
> >
> > IMO it's fundamentally broken; not just implementation, but the concept
> > itself.
> >
> > NAKed-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>

Al, thank you for this information. It does shine a little light on where
dragons may be hiding. I was wondering if you could comment on a related
issue.

linkat does allow specifying AT_EMPTY_PATH. However, it requires
CAP_DAC_READ_SEARCH. I saw that a patch went into the kernel to remove
this restriction, but was shortly thereafter reverted with a comment
to the effect of "We'll have to think about this a little more". Then,
radio silence. Other than requiring /proc be mounted to bypass, what
problem does this restriction solve?

Also related, the thing I'm even more interested in is the ability to
create an O_TMPFILE, populate it, set permissions, etc, and then make
it appear in a directory. The problem is that I almost never want it
to simply appear; I want it to atomically replace an older version of
the file.

dfd = openat(x, "y", O_RDONLY | O_CLOEXEC | O_DIRECTORY, 0)
fd = openat(dfd, ".", O_RDWR | O_CLOEXEC | O_TMPFILE, 0600)
do stuff with fd
fsync(fd)
linkat(fd, "", dfd, "z", AT_EMPTY_PATH | AT_REPLACE?)
close(fd)
fsync(dfd)
close(dfd)

The AT_REPLACE flag has been suggested before to work around the EEXIST
behavior. Alternatively, renameat2 could accept AT_EMPTY_PATH for
olddirfd/oldpath, but fixing up linkat seems a little cleaner. Without
this, it hardly seems worthwhile to use O_TMPFILE at all; one might as
well go through the hassle of creating the file with a random name
(plus exposing that file and possibly having to rm it on error).
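
For comparison, the named-temporary-file fallback described above looks
roughly like this. It is only a sketch: replace_via_tempname() and the
".tmp.<pid>.<n>" naming scheme are hypothetical, a real implementation
would want an unpredictable suffix, and a short write is simply treated
as a failure.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Atomically replace `name` in directory `dfd` with `len` bytes from `buf`,
 * going through a visible temporary name in the same directory.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int replace_via_tempname(int dfd, const char *name,
				const void *buf, size_t len)
{
	char tmp[64];
	int fd = -1, saved;
	unsigned int i;

	/* Pick an unused name; O_EXCL makes the creation itself race-free. */
	for (i = 0; i < 100 && fd < 0; i++) {
		snprintf(tmp, sizeof(tmp), ".tmp.%ld.%u", (long)getpid(), i);
		fd = openat(dfd, tmp,
			    O_WRONLY | O_CREAT | O_EXCL | O_CLOEXEC, 0600);
		if (fd < 0 && errno != EEXIST)
			return -1;
	}
	if (fd < 0)
		return -1;

	if (write(fd, buf, len) != (ssize_t)len || fsync(fd) != 0)
		goto fail;
	/* The rename is what provides the atomic replacement of the old file. */
	if (renameat(dfd, tmp, dfd, name) != 0)
		goto fail;
	close(fd);
	return fsync(dfd);

fail:
	saved = errno;
	close(fd);
	unlinkat(dfd, tmp, 0);	/* the cleanup that O_TMPFILE would avoid */
	errno = saved;
	return -1;
}
```

The temporary name is visible to other processes from creation until the
rename, and the error path has to remove it by hand; those are the two
costs a replacing linkat() would eliminate.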

I haven't been able to find any explanation for the AT_REPLACE idea not
gaining traction. Is there some security reason for this?

Thanks


> Thank you for your review, which made me glad I sent out the patch early
> as an RFC. I (think I) understand the issues you pointed out and,
> although some of them could be fixed (the races), I guess there's no point
> pursuing this any further, since you consider the concept itself to be
> broken. Again, thank you for your time.
>
> Cheers,
> --
> Luís