Re: reiser4 plugins

From: David Masover
Date: Thu Jun 23 2005 - 09:27:25 EST

Nikita Danilov wrote:
> David Masover writes:
> [...]
> >
> > What we want is to have programs that can write small changes to one
> > file or to many files, lump all those changes into a transaction, and
> > have the transaction either succeed or fail.
>
> No existing file system guarantees such behavior. Even atomicity of
> single system call is not guaranteed.

No _existing_ filesystem. But I seem to recall that this was one of the
design decisions of Reiser4, and that the system call itself was pushed
off to 4.1?

Maybe I'm just wrong about how big a transaction can be. Maybe it was
limited to a single file. I don't think so, though. From the
whitepaper: "Stuffing a transaction into a single file just because you
need the transaction to be atomic is hardly what one would call flexible

I also seem to recall that the rolling back of the transaction, should
it fail, was supposed to be handled by the application. This doesn't
quite click with the whitepaper, but it could work.
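
To make "rollback handled by the application" concrete, here is a minimal sketch in Python of what that could look like on a plain POSIX filesystem today. The helper name write_all_or_roll_back is made up for illustration and is not part of reiser4 or any proposed interface. It also shows the weakness: the undo log lives in memory, so a crash halfway through still leaves a mix of old and new data, which is exactly the gap a filesystem-level transaction would be meant to close.

```python
import os


def write_all_or_roll_back(changes):
    """Apply {path: new_bytes}; on any I/O error, restore the files touched.

    Hypothetical helper, not a reiser4 interface. The rollback is done by
    the application and is NOT crash-safe: the undo log is only in memory.
    """
    undo = []  # (path, previous contents, or None if the file did not exist)
    try:
        for path, data in changes.items():
            previous = None
            if os.path.exists(path):
                with open(path, "rb") as f:
                    previous = f.read()
            undo.append((path, previous))
            with open(path, "wb") as f:
                f.write(data)
    except OSError:
        # Undo in reverse order, then let the caller see the failure.
        for path, previous in reversed(undo):
            if previous is None:
                if os.path.exists(path):
                    os.remove(path)
            else:
                with open(path, "wb") as f:
                    f.write(previous)
        raise
```

Called as write_all_or_roll_back({"a.conf": b"...", "b.conf": b"..."}), it either updates both files or puts the old contents back and re-raises the error.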

More whitepaper goodness:

"A new system call sys_reiser4() will be implemented to support
applications that don't have to be fooled into thinking that they are
using POSIX. Through this entry point a richer set of semantics will
access the same files that are also accessible using POSIX calls.
Reiser4() will not implement more than hierarchical names. A full set
theoretic naming system as described on our future vision page will not
be implemented before Reiser6() is implemented (Reiser5 is our
distributed filesystem, Reiser6 is our enhanced semantics, whether we
implement Reiser5 or Reiser6 first depends on which sponsors we find ;-)
). Reiser4() will implement all features necessary to access ACLs as
files/directories rather than as something neither file nor directory.
These include opening and closing transactions, performing a sequence of
I/Os in one system call, and accessing files without use of file
descriptors (necessary for efficient small I/O). Reiser4 will use a
syntax suitable for evolving into Reiser5() syntax with its set
theoretic naming."

So, some sort of transaction is planned.

But, as I said, I wasn't paying enough attention. Maybe there is a
technical reason why this can't be done in Linux?

> > > it doesn't stop the system dead in its tracks waiting for some very long
> > > transaction to finish?
> >
> > We've also discussed this. For one thing, if we can have transactions
> > in databases which don't stop the database dead in its tracks, why can't
> > we do it with filesystems?
>
> Because to have such transactions databases pay huge price in both
> resource consumption and available concurrency (isolation, commit-time
> locks, etc.), and yet mechanism they use to deal with stuck transactions
> (which is simply to abort it) is not very suitable for the file system.

Oh, really? Even if we've got application support through sys_reiser4? The
application should be ready to deal with a transaction abort.
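
What I mean by "ready to deal with a transaction abort" is roughly the retry loop database clients already use. A minimal sketch in Python follows; TransactionAborted and run_transaction are hypothetical names, since the whitepaper does not say how sys_reiser4 would report an abort.

```python
class TransactionAborted(Exception):
    """Hypothetical signal that the filesystem gave up on a transaction."""


def run_transaction(body, max_attempts=3):
    """Call body() until it commits or max_attempts aborts have occurred.

    body is assumed to open a transaction, do its I/O, and commit, raising
    TransactionAborted if the filesystem aborts it along the way.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return body()
        except TransactionAborted:
            if attempt == max_attempts:
                raise  # give up and report the abort to the caller
```

The application keeps the decision about how often to retry, and a transaction that keeps getting aborted surfaces as an ordinary error rather than a stalled system.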

I'm still not convinced by any of that paragraph. I don't know enough
to argue the point, but it intuitively feels wrong. After all, if the
metadata is atomic, and we are allowed to make our own system calls, why
can't we make the data atomic?
