Re: [PATCH -V14 0/11] Generic name to handle and open by handle syscalls

From: Andreas Dilger
Date: Fri Jul 02 2010 - 18:52:56 EST


On 2010-07-02, at 16:09, Neil Brown wrote:
> On Fri, 2 Jul 2010 10:12:47 -0600
> Andreas Dilger <andreas.dilger@xxxxxxxxxx> wrote:
>>
>> I haven't looked at this part of the VFS in a while, but it looks like reconnect_path() is an implementation issue specific to knfsd, and shouldn't be needed for regular files. i.e. if exportfs_encode_fh() is never used on a disconnected file, then this overhead is not incurred.
>>
>> The above use of open_by_handle() is not for userspace NFS/Samba re-export, but to allow applications to open regular files for IO.
>
> Firstly it is needed for directories so that the VFS can effectively lock
> against directory rename races which could otherwise create disconnected
> subtrees (where the first parent is a member only of one of its
> descendants). So if you get a filehandle for a directory it *must* be
> properly connected to the root for rename to be safe. This operation is
> faster than a full path lookup if the dentry is already in cache, and slower
> if it and any of the path is not in cache.

OK, so this requirement is specific to directories, and not needed at all for regular files.

> Secondly it is needed if you want to enforce the rule that the contents of a
> directory are only accessible if the 'x' bit on the directory is set.
> kNFSd does not enforce this (unless subtree_check is specified), partly
> because it is hard to do correctly and partly because we have to trust the
> client anyway, so trusting it to check the 'x' bit is very little extra trust.

If the application that called name_to_handle() already had to traverse the whole pathname to get the file handle, then there shouldn't necessarily be a requirement to repeat that traversal when calling open_by_handle(). The only permission check possible in open_by_handle() is then the one on the inode itself.
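
For illustration, here is a minimal sketch of the "traverse once, keep the handle" step. It assumes the interface that was later merged into Linux as name_to_handle_at()/open_by_handle_at(), not the exact syscall signatures in this patch series, and calls it through ctypes. The name_to_handle() wrapper and the FileHandle class are hypothetical names for this example. Reopening with open_by_handle_at() is left out because in the merged interface it requires CAP_DAC_READ_SEARCH.

```python
import ctypes
import ctypes.util
import os

MAX_HANDLE_SZ = 128   # upper bound on the opaque handle size, from <fcntl.h>
AT_FDCWD = -100       # resolve relative paths against the current directory


class FileHandle(ctypes.Structure):
    """Mirror of struct file_handle with room for the largest handle."""
    _fields_ = [
        ("handle_bytes", ctypes.c_uint),
        ("handle_type", ctypes.c_int),
        ("f_handle", ctypes.c_ubyte * MAX_HANDLE_SZ),
    ]


def name_to_handle(path):
    """Return (handle_type, handle_bytes, mount_id) for path.

    The path lookup, including the 'x' checks on every parent
    directory, happens here and only here.
    """
    libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6",
                       use_errno=True)
    fh = FileHandle()
    fh.handle_bytes = MAX_HANDLE_SZ
    mount_id = ctypes.c_int()
    rc = libc.name_to_handle_at(AT_FDCWD, os.fsencode(path),
                                ctypes.byref(fh), ctypes.byref(mount_id), 0)
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path)
    return fh.handle_type, bytes(fh.f_handle[:fh.handle_bytes]), mount_id.value
```

Some filesystems do not support handle export at all, in which case the call fails with an error and the wrapper raises OSError.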

> Note that it is not possible to reliably perform filehandle lookup for
> non-directories if you need a fully reconnected dentry, as
> cross-directory-renames can confuse the situation beyond recovery.

For normal file IO, a fully connected dentry is not needed, and in fact the handle_to_path->exportfs_decode_fh() code will accept any inode alias for regular file use.

> Maybe open-by-handle should require DAC_OVERRIDE, or maybe a new
> DAC_X_OVERRIDE. And if those aren't provided it only works for directories.

That's the big question. If the file handle has some "non-public" information in it (i.e. a capability that cannot be (easily) guessed or forged), then there should not be any need for DAC_OVERRIDE. This could easily be enforced if there was a provision for "short term" file handles that only had to live a few minutes or less, so the kernel could just store a random cookie in each file handle and require applications to get a new handle if the cookie expires or the server crashes.
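
To make the "short term" handle idea concrete, here is a user-space model of it, not kernel code. It assumes the cookie table lives only in memory (so a crash or restart invalidates every handle) and that a handle is the tuple (inode, generation, cookie). The ShortTermHandles class and its method names are hypothetical.

```python
import secrets
import time


class ShortTermHandles:
    """Model of handles that carry an unguessable, expiring cookie."""

    def __init__(self, lifetime_seconds=300.0, clock=time.monotonic):
        self.lifetime = lifetime_seconds
        self.clock = clock
        # cookie -> (inode, generation, expiry); lost on restart by design
        self._cookies = {}

    def issue(self, inode, generation):
        """Create a handle for a file the caller could already reach by path."""
        cookie = secrets.token_bytes(16)  # 128 random bits
        expiry = self.clock() + self.lifetime
        self._cookies[cookie] = (inode, generation, expiry)
        return (inode, generation, cookie)

    def open(self, handle):
        """Accept the handle only if its cookie is known, fresh and matching."""
        inode, generation, cookie = handle
        entry = self._cookies.get(cookie)
        if entry is None:
            raise PermissionError("unknown cookie, request a new handle")
        known_inode, known_generation, expiry = entry
        if self.clock() >= expiry:
            del self._cookies[cookie]
            raise PermissionError("cookie expired, request a new handle")
        if (inode, generation) != (known_inode, known_generation):
            raise PermissionError("cookie does not belong to this inode")
        return inode
```

With a scheme along these lines, knowing or guessing the inode/generation pair is not enough to open the file, which is the property that would make DAC_OVERRIDE unnecessary.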

However, even a "plain" file handle containing only the inode/generation is relatively secure in this respect, since the only ways to obtain those values for a particular file are "ls -li" for the inode number (which assumes path "x" traversal permission, short of guessing the number) and ioctl(FS_IOC_GETVERSION) for the generation, which requires being able to open the inode already.
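
As a sketch of those two routes, the following reads the inode number with fstat() and the generation with ioctl(FS_IOC_GETVERSION). It assumes an architecture that uses the generic ioctl number encoding (x86 and ARM do) and a filesystem that implements the ioctl; otherwise the generation comes back as None. The helper names are hypothetical.

```python
import fcntl
import os
import struct


def _ior(type_char, nr, size):
    """Build a read-direction ioctl number (generic Linux encoding)."""
    return (2 << 30) | (size << 16) | (ord(type_char) << 8) | nr


# Declared in <linux/fs.h> as _IOR('v', 1, long)
FS_IOC_GETVERSION = _ior("v", 1, struct.calcsize("l"))


def inode_and_generation(path):
    """Return (inode, generation); generation is None if unsupported."""
    # Opening the file is itself subject to the usual path permission checks.
    fd = os.open(path, os.O_RDONLY)
    try:
        inode = os.fstat(fd).st_ino
        buf = bytearray(struct.calcsize("l"))
        try:
            fcntl.ioctl(fd, FS_IOC_GETVERSION, buf)
            # The kernel stores the generation as a 32-bit value.
            generation = struct.unpack_from("I", buf)[0]
        except OSError:
            generation = None
    finally:
        os.close(fd)
    return inode, generation
```

Both steps need an open file descriptor or a successful path walk, so neither value leaks to a process that could not reach the file in the first place.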

Protection that relies on the inode number alone is fairly weak: there are at most 2^32 inodes in most filesystems, and usually far fewer. Guessing the generation number as well is much harder (though not impossible).
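
A rough back-of-the-envelope comparison of the two search spaces, assuming a 32-bit generation number (as in the ext2/3/4 inode) and a purely hypothetical attacker rate of one million attempts per second:

```python
INODE_BITS = 32
GENERATION_BITS = 32            # assumption: 32-bit generation field
GUESSES_PER_SECOND = 1_000_000  # hypothetical attacker rate
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def expected_guesses(space):
    """Average tries to hit one uniformly chosen value without repeats."""
    return (space + 1) / 2


inode_space = 2 ** INODE_BITS
handle_space = 2 ** (INODE_BITS + GENERATION_BITS)

inode_only_seconds = expected_guesses(inode_space) / GUESSES_PER_SECOND
handle_years = (expected_guesses(handle_space)
                / GUESSES_PER_SECOND / SECONDS_PER_YEAR)

print(f"inode only: {inode_space} values, ~{inode_only_seconds:.0f} s on average")
print(f"inode+generation: {handle_space} values, ~{handle_years:.0f} years on average")
```

This treats every value as equally likely, which overstates the difficulty if generation numbers are assigned predictably.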

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
