Re: [RFC PATCH] fs: Add user_file_or_path_at and use it for truncate

From: Al Viro
Date: Wed Aug 28 2013 - 15:59:47 EST


On Wed, Aug 28, 2013 at 12:04:43PM -0700, Andy Lutomirski wrote:
> On Tue, Aug 27, 2013 at 11:16 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > On Tue, Aug 27, 2013 at 01:28:27PM -0700, Andy Lutomirski wrote:
> >> There are also O_PATH fds, and I'm not sure what the semantics of
> >> O_PATH fds are or should be when they refer to something other than a
> >> directory.
> >
> > O_PATH file just points to specific location in the tree, no more and
> > no less.
>
> I don't know whether ftruncate(some O_PATH fd) should work. But this
> probably barely matters.

It shouldn't. No IO on these guys at all.

> > AFAICS, the *only* cases when we might possibly care are linkat() target,
> > truncate() and open(). Note, BTW, that right now we *do* allow an attempt
> > to reopen a file via procfs symlink r/w, even when file had been r/o.
> > It's subject to permissions on the object being opened, but that's it.
> >
> > I'm not sure we can change that - again, it's a user-visible API, and
> > the change is very likely to break some scripts. In fact, it's about
> > as dangerous as a full-blown switch to dup-style semantics for procfs
> > opens, and it's a lot less attractive.
> >
> > For truncate() we would only need to have FMODE_WRITE reported, more or
> > less the same way as FMODE_FLINK. And without open() changes it doesn't
> > buy us anything at all...
> >
> > I've no problem with unrolling the user_path_at() in do_sys_truncate()
> > into an explicit loop by trailing symlinks and checking for indication
> > left by proc_pid_follow_link(), more or less the same way as with
> > LOOKUP_LINK in lookup_last(). It's _far_ less invasive than playing
> > with "oh, here we fill a struct path or maybe a struct file" horrors,
> > pinning struct file for no reason, etc.
> >
> > AFAICS, the real question is whether we dare to change open() behaviour on
> > /proc/*/fd/*. I've played with that a bit and I believe that I can do
> > the switch to dup-style with very localized changes in fs/namei.c and
> > fs/proc/{base,fd}.c. Will be even binary compatible kernel-side -
> > ->atomic_open() returns NULL/ERR_PTR where it used to return 0/-error,
> > not that we had many instances to convert. *IF* that variant is not
> > out of consideration for userland API stability reasons, I would certainly
> > prefer to go that way; turns out that these days we can pull it off without
> > black magic in descriptor handling, etc. Linus?
>
> I personally find the check-mode-but-get-a-new-struct-file version to
> be less weird that the dup approach. Either approach will break
> scripts that try to write to /dev/stdin (which is the whole point).

What, breaking existing userland? IMO that's a thing to avoid, unless we
have really, really strong reasons not to. And yes, it goes for both
variants... FWIW, I'm not convinced that the reasons you are giving for
it are strong enough - passing somebody a read-only file descriptor to
a file they could open for write and relying on their inability to truncate
the fscker just because it's not reachable via any path they've got search
permissions to looks like a Bloody Bad Idea(tm), and not only because it won't
do what you hope it'll do on existing kernels. It's very easy to fuck up
and end up with a searchable path to the damn thing; e.g. /proc/<pid>/cwd/foo
will bypass the grandparent of foo not being searchable for you, etc.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/