Re: [git pull] vfs.git part 2

From: Linus Torvalds
Date: Fri Jul 12 2013 - 15:42:24 EST


On Fri, Jul 12, 2013 at 12:13 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Jul 12, 2013 at 04:30:45PM +0000, Rasmus Villemoes wrote:
>>
>> How about simply making O_TMPFILE == O_DIRECTORY | O_RDWR |
>> O_TMPFILE_INTERNAL, and letting the correct use be
>>
>> open("/some/dir", O_TMPFILE) [with or without a mode argument]
>>
> Hrm... I can't say I like it, but it's almost OK; the only problem here
> is the bug fixed by commit bc77daa78 - on some of the old kernels (including
> 3.10, BTW) we used to allow opening /proc/self/fd/0 with O_DIRECTORY|O_RDWR ;-/
>
> Said that, I think it's more tolerable than the kludge I came up with -
> one would need to pass it a procfs symlink as argument to hit that.
> Linus, your opinion?

I think I like it. Because we really shouldn't rely on "the directory
already exists", since it's actually quite possible that it doesn't.
Sure, things like /tmp and /usr/tmp we can generally rely on, but
mkstemp() and friends are often done using TMPDIR etc, so for an
O_TMPFILE we really shouldn't assume that the directory is some
long-term stable and reliable thing.

My only suggestion is that we *enforce* that O_DIRECTORY is set, and
that O_CREAT is not set (the latter is the reverse of what we do now),
so that we don't get programs that "happen" to work on older kernels
(the /proc bug thing I think we can ignore - at least it makes the
possibility of accidental problems much *much* less).

That said, I'm not sure about O_RDWR. There are ways to possibly turn
an fd into a new path, so I could imagine O_WRONLY being useful
("create a temporary file, fill in the content, do fdatasync, then
atomically make it appear in the filesystem with a linkat() system
call").

I'd actually want to at least bring up again the possible requirement
that the pathname argument to O_TMPFILE must end in a '/'. It would
be an easy check to add, and then we could actually drop the whole
O_DIRECTORY flag, and O_CREAT becomes a non-issue too. The only
downside of that is that it might be very inconvenient for user mode
(e.g. if user-mode just wants to use the TMPDIR environment variable
directly), so it might well make for a "better" patch for the kernel,
but be much worse as an ABI issue.
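
To illustrate both sides of that trade-off, here is a rough user-space sketch. The helper names are hypothetical; the first function mirrors the "easy check" on the pathname, the second shows the extra string handling a program would need before it could pass TMPDIR along.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* The "easy check": accept the pathname only if it ends in '/'. */
static bool ends_in_slash(const char *path)
{
	size_t len = strlen(path);

	return len > 0 && path[len - 1] == '/';
}

/* The user-mode cost: TMPDIR usually has no trailing slash, so the
 * caller has to build a new string. Returns 0 on success, -1 if the
 * result does not fit in buf. */
static int with_trailing_slash(const char *dir, char *buf, size_t buflen)
{
	int n = snprintf(buf, buflen, "%s%s", dir,
			 ends_in_slash(dir) ? "" : "/");

	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

The check itself is one comparison; the inconvenience is all in the second helper, which every caller using getenv("TMPDIR") would end up carrying around.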

As to the mode argument: we should encourage people to have it, since
the inode *may* become visible afterwards. See above (can you do
linkat() to just turn an fd into a name? I didn't really check, but I
think you can do it as a "link(/proc/self/fd/..)" thing regardless).
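
A sketch of the "create, fill, fdatasync, then link" sequence, with an explicit mode since the inode may end up visible. It assumes the O_TMPFILE flag as it later shipped in the kernel and libc headers, a mounted /proc, and the linkat() on /proc/self/fd/N with AT_SYMLINK_FOLLOW that the open(2) man page describes; publish_tmpfile() is a hypothetical helper name.

```c
#define _GNU_SOURCE		/* for O_TMPFILE */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Create an unnamed file in dir, fill it, sync it, then give it a name.
 * Returns 0 on success, -1 with errno set on failure (including kernels
 * or filesystems without O_TMPFILE support). */
static int publish_tmpfile(const char *dir, const char *final_path,
			   const void *data, size_t len, mode_t mode)
{
	char proc_path[64];
	const char *p = data;
	int fd, saved;

	/* O_WRONLY is enough: we only fill the file, never read it back. */
	fd = open(dir, O_TMPFILE | O_WRONLY, mode);
	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			goto fail;
		}
		p += n;
		len -= (size_t)n;
	}
	if (fdatasync(fd) < 0)
		goto fail;

	/* Turn the fd into a name by linking through its procfs symlink. */
	snprintf(proc_path, sizeof proc_path, "/proc/self/fd/%d", fd);
	if (linkat(AT_FDCWD, proc_path, AT_FDCWD, final_path,
		   AT_SYMLINK_FOLLOW) < 0)
		goto fail;
	return close(fd);

fail:
	saved = errno;
	close(fd);
	errno = saved;
	return -1;
}
```

A call such as publish_tmpfile("/var/tmp", "/var/tmp/report", buf, n, 0600) either leaves a complete file under the final name or no name at all; linkat() fails with EEXIST if the name is already taken.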

Linus