Re: filesystem transactions API

From: Jan Hudec
Date: Wed Apr 27 2005 - 04:36:17 EST

On Tue, Apr 26, 2005 at 16:24:34 +0100, Jamie Lokier wrote:
> John Stoffel wrote:
> > >>>>> "Jamie" == Jamie Lokier <jamie@xxxxxxxxxxxxx> writes:
> >
> > Jamie> No. A transaction means that _all_ processes will see the
> > Jamie> whole transaction or not.
> >
> > This is really hard. How do you handle the case where process X
> > starts a transaction modifies files a, b & c, but process Y has file b
> > open for writing, and never lets it go? Or the file gets unlinked?
> Then it starts to depend on what kind of transactions you want to
> implement.
> You can say that a transaction isn't allowed when a process has one of
> the files opened for writing. Or you can say a transaction is
> equivalent to calling all of the I/O system calls at once. You can
> also decide if you want the reads and directory lookups performed in
> the transactions to become prerequisites for the transaction
> completing (so it's aborted if another process writes to those file
> regions or changes the directory structure in a way which breaks a
> prerequisite), or if you want those to lock the things which are read
> for the duration of the transaction, or even just ignore reads for
> transaction purposes. Or, you can say that transactions are limited
> to just directory structure, and not file contents (that's good enough
> for package management), or you can say they're limited to just file
> contents (that's good enough for databases and text file edits).
> Etc, etc, quite a lot of semantic choices.

How do we specify which calls belong to a transaction? By some kind of
extra file handle?

I don't think a single global per-process transaction is the best way.
Instead we should have some kind of transaction handle (probably in
the file handle space) and a way to say that a syscall is done within
a transaction. To avoid duplicating all syscalls, we could have
a set_active_transaction() operation.

Now I think the criterion for the semantics should be serializability.
That would mean that lookup paths have to be locked if and only if the
lookup was done within the transaction. You would still be free to open
a file outside any transaction, then call set_active_transaction() and
write to that file. That way the write would become atomic, but someone
else could freely rename the file from under you.
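
To make the handle idea concrete, here is a minimal user-space sketch in
C. Everything in it is hypothetical: txn_begin(), txn_pwrite(),
txn_commit() and txn_abort() are names made up for illustration, and
set_active_transaction() is only the operation proposed above, not an
existing syscall. A real handle would probably live in the file handle
space, and the kernel would have to provide the atomicity and isolation.
This emulation merely defers writes until commit, so it shows the call
pattern and nothing more.

```c
/* Hypothetical emulation: none of the txn_* calls exist in any kernel. */
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define TXN_MAX_OPS 16

struct txn_op { int fd; off_t off; size_t len; char *data; };
struct txn { struct txn_op ops[TXN_MAX_OPS]; int nops; };

/* The "active transaction"; NULL means calls act immediately. */
static struct txn *active_txn;

struct txn *txn_begin(void) { return calloc(1, sizeof(struct txn)); }

void set_active_transaction(struct txn *t) { active_txn = t; }

/* Like pwrite(), but queued if a transaction is active. */
ssize_t txn_pwrite(int fd, const void *buf, size_t len, off_t off)
{
    struct txn *t = active_txn;
    if (t == NULL)
        return pwrite(fd, buf, len, off);
    if (t->nops == TXN_MAX_OPS)
        return -1;
    char *copy = malloc(len ? len : 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, buf, len);
    t->ops[t->nops++] = (struct txn_op){ fd, off, len, copy };
    return (ssize_t)len;
}

/* Release a transaction and everything it has queued. */
static void txn_free(struct txn *t)
{
    for (int i = 0; i < t->nops; i++)
        free(t->ops[i].data);
    if (active_txn == t)
        active_txn = NULL;
    free(t);
}

/* Apply queued writes in order. NOT atomic here; a kernel would be. */
int txn_commit(struct txn *t)
{
    assert(t != NULL);
    int rc = 0;
    for (int i = 0; i < t->nops; i++) {
        struct txn_op *op = &t->ops[i];
        if (pwrite(op->fd, op->data, op->len, op->off) != (ssize_t)op->len) {
            rc = -1;
            break;
        }
    }
    txn_free(t);
    return rc;
}

/* Drop queued writes without touching any file. */
void txn_abort(struct txn *t) { txn_free(t); }
```

The file descriptor can come from a plain open() done outside the
transaction, which matches the case above: the write is covered by the
transaction, the path lookup is not.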

Note: Editors currently write to a temporary file and rename it over the
original (if they have the permissions to do so), which is as good a
transaction as they need.
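
For reference, the write-then-rename pattern looks roughly like this in
C. It is a sketch, not what any particular editor does: the function
name replace_file_atomically() and the ".XXXXXX" temporary-name suffix
are my own choices, and it assumes the temporary file can be created in
the same directory as the target so that rename() stays within one
filesystem.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Replace the contents of `path` with `data`. Readers that open `path`
 * see either the complete old file or the complete new one.
 * Returns 0 on success, -1 on failure (the original is left untouched).
 */
int replace_file_atomically(const char *path, const char *data, size_t len)
{
    assert(path != NULL && data != NULL);

    /* Temporary file next to the target, so rename() does not cross
     * a filesystem boundary. */
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.XXXXXX", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    int fd = mkstemp(tmp);
    if (fd < 0)
        return -1;

    /* write() may be partial, so loop until everything is written. */
    size_t done = 0;
    while (done < len) {
        ssize_t w = write(fd, data + done, len - done);
        if (w < 0)
            goto fail;
        done += (size_t)w;
    }

    /* Flush the data to disk before the new name becomes visible. */
    if (fsync(fd) != 0)
        goto fail;
    if (close(fd) != 0) {
        fd = -1;
        goto fail;
    }
    fd = -1;

    /* The atomic step: the name switches from old file to new file. */
    if (rename(tmp, path) != 0)
        goto fail;
    return 0;

fail:
    if (fd >= 0)
        close(fd);
    unlink(tmp);
    return -1;
}
```

Note that mkstemp() creates the file with mode 0600, so a real editor
would also have to copy the ownership and permissions of the original.
And the pattern only covers one file: it gives no way to replace
files a, b and c together, which is where a transaction API would come
in.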

> > What about programs that are already open and running?
> >
> > It might be doable in some sense, but I can see that details are
> > really hard to get right. Esp without breaking existing Unix
> > semantics.
> It's even harder without kernel support! :)

If every syscall that touches the filesystem were turned into
a transaction of its own, it wouldn't break any existing semantics.

Jan 'Bulb' Hudec <bulb@xxxxxx>
