Re: [RFC] atomic create+open

From: Miklos Szeredi
Date: Fri Oct 07 2005 - 09:01:53 EST


> > > > I just think that filesystem code should _never_ need to care about
> > > > mounts. If you want to do the lookup+open, you somehow will have to
> > > > deal with mounts, which is ugly.
> > >
> > > You appear to think that atomic lookup+open is a question of choice. It
> > > is not.
> >
> > Atomic lookup+open is an optimization, and as such a question of
> > choice. Atomic create+open is not.
>
> Really? Under NFSv4, the one and only OPEN command does an atomic lookup
> +open, It _has to_ in order to deal with all the races.
>
> Once that is the case, then separating lookup and open into two
> operations means that you need to worry about namespace changes on the
> server too (since OPEN takes a name argument rather than a filehandle).

So, you are saying OPEN has to do the lookup too. That's OK, but that
_does not_ mean that you have to do the OPEN operation from the
->lookup() or ->d_revalidate() methods. In fact you cannot do the
later without getting into trouble over mounts.

> If you end up opening a different file to the one you looked up, things
> can get very interesting.

You can replace the inode in ->create_open() if you want to. Or let
the VFS redo the lookup (as if d_revalidate() returned 0).

> > I know you are thinking of the non-exclusive create case when between
> > the lookup and the open the file is removed or transmuted on the
> > server..
>
> > Yes, it's tricky to sovle, but by no means impossible without atomic
> > lookup+open. E.g. consider this pseudo-code (only the atomic
> > open+create case) in open_namei():
>
> Firstly, that pseudo-code doesn't deal at all with the race you describe
> above. It only deals with lookup + file creation.

It does deal with the race:

__lookup_hash() returns a positive dentry

file is removed on server

"else if (!(flag & O_EXCL) && may_create(dir))" condition is met

__follow_mount() return false

vfs_create_open() calls ->create_open()

NFS does OPEN (O_CREAT), file is opened, dentry replaced (this is not
ellaborated in the pseudocode).


> Secondly, it also fails to deal with the issue of propagation of open
> context.
> If you open a file, then that creates open context/state on the server.
> Most protocols will then have some way of tracking that state using an
> identifier (the equivalent of the POSIX open file descriptor). I see
> absolutely nothing in your proposal that will allow me to save the state
> identifier that results from atomic open+create and then propagate it to
> the struct file.

If you read the original patch, ->open_create() has a 'struct file *'
parameter, just like ->open().

Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/