Re: [PATCH v2 resend] vfs: new O_NODE open flag

From: Miklos Szeredi
Date: Thu Nov 05 2009 - 11:53:38 EST

On Thu, 5 Nov 2009, Alan Cox wrote:
> > > Fortunately you can patch it by hand.
> >
> > How do you patch it by hand?
> I find "joe" quite useful, some people prefer vi or emacs

Emacs, please. But I'm not quite sure anymore what we are talking
about :)

> > "A file descriptor opened with O_NODE | O_NOACCESS may be used to
> > re-open the same file later with increased permissions
> > (e.g. O_RDWR) if the access mode allows. This is true even if the
> > permissions on the path leading up to the file would prevent it"
> Which is contrary to the assumptions made by systems designers for the
> past forty years, so it's a very dangerous assumption to break.

I don't know. I would be surprised if anyone actually found a setup
where the current "hole" wrt proc makes any difference.

Take a step back and look at what would be required for this to be

1) a file which is readable and writable by some user
2) but is not reachable by said user (because of permissions on path)
3) a privileged process opening this file with O_RDONLY
4) sending the fd to an unprivileged process owned by user
5) assuming that user won't be able to write it (even though the file
has write permission)
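
For illustration, the re-open step that this scenario relies on can be
sketched from user space. This is only a minimal sketch under stated
assumptions: it uses a plain O_RDONLY descriptor and the existing
/proc/self/fd/N links (not the proposed O_NODE flag), it runs in a
single process rather than passing the fd between a privileged and an
unprivileged one, and the helper names reopen_via_proc() and demo() are
made up for this example.

```python
import os
import tempfile


def reopen_via_proc(fd, flags):
    """Re-open the file behind `fd` through its /proc/self/fd entry.

    The lookup goes through the proc link, not the original path, so
    only the permissions on the file itself are checked.
    """
    return os.open(f"/proc/self/fd/{fd}", flags)


def demo():
    # Hypothetical stand-in for the file in steps 1-3: a file that the
    # current user is allowed to read and write.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name

    ro_fd = os.open(path, os.O_RDONLY)
    try:
        # Writing through the read-only descriptor is refused.
        try:
            os.write(ro_fd, b"x")
            wrote_via_ro = True
        except OSError:
            wrote_via_ro = False

        # Re-opening through proc yields a writable descriptor, because
        # the file mode grants write access to this user.
        rw_fd = reopen_via_proc(ro_fd, os.O_RDWR)
        try:
            os.write(rw_fd, b"hello")
        finally:
            os.close(rw_fd)
    finally:
        os.close(ro_fd)

    with open(path, "rb") as f:
        data = f.read()
    os.unlink(path)
    return wrote_via_ro, data
```

Calling demo() on a Linux system with proc mounted should show that the
read-only descriptor cannot be written, while the descriptor obtained
through proc can.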

If I was a system designer, I'd think of that as a very fragile

I'm still not sure which is more harmful, leaving this "hole" as it
is, or fixing it, thereby possibly breaking uses of this (mis)feature.
IMO the latter is far more likely to cause problems.

> What are the semantics with regards to vhangup ?

The file descriptor opened with O_NODE cannot refer to a terminal.

> What are the semantics of O_NODE opening a device file when the device is
> later unloaded and a new device is created on the same node with totally
> unrelated permissions ? [happens all the time btw]

Right, that's a valid point.

So re-opening a device node opened with O_NODE is not safe, I agree.
Which means, I'll either have to remove the possibility of re-opening
O_NODE files through proc, or limit it to non-device nodes.

I'd really prefer just limiting it, since that would leave re-opening
a useful feature, while having minimal risk (especially if documented)
of it causing trouble.
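
To make the "limit it to non-device nodes" option concrete, here is a
rough user-space analogue of such a check. It is an assumption-laden
sketch, not the kernel-side change: the real restriction would live in
the proc re-open path, and the helper name reopen_if_not_device() is
hypothetical.

```python
import os
import stat


def reopen_if_not_device(fd, flags):
    """Re-open `fd` through /proc/self/fd unless it is a device node.

    Hypothetical user-space analogue of the proposed restriction: the
    device behind a character or block node may have been replaced
    since the first open, so re-opening those is refused.
    """
    mode = os.fstat(fd).st_mode
    if stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
        raise PermissionError("refusing to re-open a device node")
    return os.open(f"/proc/self/fd/{fd}", flags)
```

With this policy a descriptor for a regular file can still be upgraded,
while one for something like /dev/null is rejected.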

But I can be convinced either way with sufficiently good arguments :)

> > > But that isn't the case for some things - consider CIFS and other network
> > > file systems.
> >
> > Why?
> open O_NODE
> remote file moves
> new one appears
> reopen


remote file moves
new one appears

What's the difference?

> Now what should happen and what does happen ?

I'd like fstat() to work in that situation as well, but any patches in
that direction consistently get rejected by Christoph, saying
"CIFS/FUSE/etc are broken by design, we shouldn't care".

> > of the volume does not allow any access to the user, so normal
> > open/chdir won't work. Yet open(O_NODE) will and so user can pin the
> > volume.
> and without permission on the node.
> > However, there's not all that much difference between the above and
> > doing "stat()" on the mountpoint in a tight loop, except the former is
> > a more reliable way to prevent unmounting.
> That doesn't seem to be the case testing it, but it's fixable trivially if
> so and it's fixable without API breakage.

No it's not. While a node is looked up, it will pin the mount, albeit
for just a short time. And permissions on the inode itself are not
required for lookup, only permissions on the parent.

I think you are being paranoid here. If the user has access to the
path leading up to the mountpoint, he might as well pin that mount.
Permissions on the mount itself shouldn't really make a difference.
