Re: [linux-cifs-client] Re: unlink behavior when file is open by other process
From: Jeff Layton
Date: Fri Oct 17 2008 - 14:10:31 EST
On Fri, 17 Oct 2008 12:41:07 -0500
"Steve French" <smfrench@xxxxxxxxx> wrote:
> On Fri, Oct 17, 2008 at 12:27 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > On Fri, 17 Oct 2008 10:24:29 -0500
> > "Steve French" <smfrench@xxxxxxxxx> wrote:
> >> On Fri, Oct 17, 2008 at 10:09 AM, Steve French <smfrench@xxxxxxxxx> wrote:
> >> > Even when a file is open by another process, posix allows the file to
> >> > be deleted (the file is removed from the namespace and eventually the
> >> > data is removed from disk). Unfortunately due to problems in some
> >> > NAS filers/servers this can be hard to implement, and I am not sure
> >> > what the "best" behavior is in that case.
> > My argument is that the primary function of unlink() is to remove a
> > filename from the filesystem. If we return success from such a call
> > without actually having freed up that filename, then that's a bug. It's
> > unfortunate that some servers don't support the calls we need to make
> > this work all the time.
> The filename will be freed - and the trade off is which breaks fewer apps.
> Both open and unlink man pages list plausible return codes but I
> am worried that the sequence of file operations open/unlink/close
> (I think we see both dbench and connectathon do this IIRC)
> is as common a sequence as open/unlink/create/close
> thus we could break more apps your way than leaving it as is.
I think we have to shoot for mimicking POSIX as closely as we can. If
we can't do it, I don't think we have any option other than to return
error. We also have to consider that the existing behavior is racy and
unreliable. Suppose we have 2 processes:
Process 1 Process 2
So this will work against a server that doesn't support rename by
filehandle, but what happens if the second create comes in before
process 2 closes the file? The create will fail. The problem is that we
can't predict this. It all depends on the timing.
Someone may QA their application and have it work great, then all of a
sudden when they move to a big load, this sort of thing starts failing
because the timing has changed.
I would *much* prefer to have an application fail reliably than work 90%
of the time, and fail unpredictably the other 10%. Consistency is key.
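For reference, here is a minimal sketch of the open/unlink/create sequence as POSIX expects it to behave, run against a local filesystem rather than a CIFS mount. The function name, the file name and the returned keys are illustrative assumptions only. It shows the end result an application expects: once unlink() returns success, the name is free, and a create of that name succeeds even while the old file is still held open.

```python
import os
import tempfile


def open_unlink_create(directory):
    """Run open -> unlink -> create on a local POSIX filesystem.

    Illustrative only: this does not touch CIFS; it shows the behavior
    an application expects after unlink() has returned success.
    """
    path = os.path.join(directory, "testfile")  # hypothetical file name

    # One opener holds the file open across the unlink.
    fd_old = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    os.write(fd_old, b"old data")

    # unlink() succeeds and the name disappears immediately.
    os.unlink(path)
    name_freed = not os.path.exists(path)

    # O_EXCL makes the create fail if the name were still in use,
    # so success here shows the name really was freed.
    fd_new = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o600)
    os.write(fd_new, b"new")

    # The original opener still sees its own data until it closes.
    os.lseek(fd_old, 0, os.SEEK_SET)
    old_contents = os.read(fd_old, 64)

    # The old and new files are different objects.
    distinct_inodes = os.fstat(fd_old).st_ino != os.fstat(fd_new).st_ino

    os.close(fd_old)
    os.close(fd_new)
    os.unlink(path)

    return {
        "name_freed": name_freed,
        "old_contents": old_contents,
        "distinct_inodes": distinct_inodes,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(open_unlink_create(tmp))
```

Against a server that cannot rename by filehandle, the O_EXCL create is the step that can fail, depending on whether the other opener has closed the file yet.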
> > We can't however make assumptions about what applications want. We
> > could, in principle, fix up the situation where a server does
> > open->unlink->create by truncating the old file and pretending that
> > it's a new one.
> That could corrupt data - the original opener may need that data up
> to the moment they close that handle.
Good point, all the more reason not to do this.
> > All we can reasonably do is try to have each syscall give us the
> > expected end result or an error if it can't be done.
> The open syscall is allowed to fail with ETXTBSY (or even access
> denied, among others).
> Although this type of failure is not common on open, it is more common on
> open (and create) than on unlink, and thus it is more likely that an app
> could deal with an error on open than on the unlink that preceded it. The
> other argument here is that whether or not we allow unlink (when the file
> can be marked for deletion but not silly renamed), we have apps that will
> get the same error on open due to Windows, MacOS and other non-Linux
> clients setting the flag (i.e. open/create failures for a filename that was
> marked delete-on-close can still happen even if we aren't the ones who set
> the flag on the file, since Windows and various other OSes can and do set
> this flag on the server or remotely)
Sure, I'm not disputing whether returning an error on open is right or
wrong. The problem is that it's not expected. We've just unlinked the
filename and returned success -- there is *no* reason that the create
should fail here. An application programmer will (rightfully) consider
this a bug.
Again, I think we have to try and mimic POSIX to the best of our
ability and just return error on anything else.
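As a rough sketch of the policy I'm arguing for (not the actual CIFS client code): the function name, its two boolean parameters and the choice of EBUSY are all hypothetical, picked only to illustrate "free the name or return an error".

```python
import errno


def unlink_policy(open_elsewhere, server_can_rename_by_handle):
    """Return (action, errno) for an unlink request.

    Hypothetical decision table, not real CIFS client logic:
    errno 0 means unlink() reports success, which is only allowed
    when the filename is actually freed.
    """
    if not open_elsewhere:
        # Nobody holds the file open: a plain delete frees the name.
        return ("delete", 0)
    if server_can_rename_by_handle:
        # Rename the open file out of the way, then mark it
        # delete-on-close; the original name is free immediately.
        return ("rename by filehandle, then delete-on-close", 0)
    # The name cannot be freed now, so report failure instead of
    # claiming success. EBUSY is an assumed choice of error code.
    return ("none", errno.EBUSY)
```

With this shape, a later create of the same name never fails as a side effect of an unlink that claimed to succeed.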
Jeff Layton <jlayton@xxxxxxxxxx>