Re: ext3-2.4-0.9.4

From: Matthias Andree (matthias.andree@stud.uni-dortmund.de)
Date: Fri Aug 03 2001 - 22:13:20 EST


On Fri, 03 Aug 2001, Patrick J. LoPresti wrote:

> To fill in more of the table, Qmail does:
>
> fd = open(tmp)
> write(fd)
> fsync(fd)
> link(tmp,final)
> close(fd)

http://cr.yp.to/qmail/faq/reliability.html
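
For illustration, the quoted sequence could be sketched in C roughly as follows. This is not qmail's actual code: the function name deliver_with_link() and its error handling are assumptions made up for this example. It only shows the ordering write, fsync, link, close, and like the quoted sequence it leaves the temporary name in place.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from qmail): write buf to tmp_path, fsync it,
 * then link it to final_path. Returns 0 on success, -1 on failure. */
int deliver_with_link(const char *tmp_path, const char *final_path,
                      const char *buf, size_t len)
{
    /* O_EXCL: refuse to reuse an existing temporary file. */
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    int ok = 1;
    size_t off = 0;

    /* write() may be short, so loop until all bytes are written. */
    while (ok && off < len) {
        ssize_t n = write(fd, buf + off, len - off);
        if (n <= 0)
            ok = 0;
        else
            off += (size_t)n;
    }

    /* Commit the file data before the final name is created. */
    if (ok && fsync(fd) < 0)
        ok = 0;

    /* Only now make the message visible under its final name. */
    if (ok && link(tmp_path, final_path) < 0)
        ok = 0;

    /* On the success path the data was already fsync()ed, so the
     * result of close() is ignored in this sketch. */
    close(fd);

    if (!ok) {
        unlink(tmp_path);   /* clean up the temporary file on failure */
        return -1;
    }
    return 0;
}
```

Note that whether the new directory entry created by link() is itself on disk after this returns is exactly the question discussed in this thread; the sketch does nothing to force it.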

> ...and Postfix does:
>
> fd = open(final)
> write(fd)
> (should be an "fsync(fd)" here, but I cannot find it)
> fchmod(fd,+execute)
> fsync(fd)
> close(fd)

> Postfix apparently uses the execute bit to indicate that delivery is
> complete. I am probably misreading the source (version 20010228
> Patchlevel 3), but I do not see any fsync() between the write and the
> fchmod. Surely it is there or this delivery scheme is not reliable on
> any system, since without an intervening fsync() the writes to the
> data and the permissions can happen out of order.

Not really. The error codes from a failed fsync() or close() are
propagated back to the caller, which then decides what to do. smtpd.c
nukes the file. postdrop.c/sendmail.c do not, but the pickup daemon will
see that the file had problems on sync and discard it.
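
A minimal sketch of that error propagation, assuming a caller that removes the file on failure (the "nuke" case). This is not Postfix source: the name deliver_with_exec_bit() and the mode 0700 are assumptions for this example, and the ordering simply follows the quoted sequence (write, fchmod, fsync, close).

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from Postfix): write buf to path, set the
 * execute bit to mark the file complete, fsync and close. Any error is
 * reported to the caller after the file has been removed. */
int deliver_with_exec_bit(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;          /* nothing was created, so nothing to remove */

    int ok = 1;
    size_t off = 0;

    /* write() may be short, so loop until all bytes are written. */
    while (ok && off < len) {
        ssize_t n = write(fd, buf + off, len - off);
        if (n <= 0)
            ok = 0;
        else
            off += (size_t)n;
    }

    /* Execute bit as the "delivery complete" marker (mode is assumed). */
    if (ok && fchmod(fd, 0700) < 0)
        ok = 0;

    /* Errors from fsync() and close() both count as failed delivery. */
    if (ok && fsync(fd) < 0)
        ok = 0;
    if (close(fd) < 0)
        ok = 0;

    if (!ok) {
        unlink(path);       /* discard the incomplete file */
        return -1;
    }
    return 0;
}
```

A caller that prefers to leave the file for a later cleanup pass could drop the unlink() and rely on the missing execute bit instead.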

I'm asking Wietse off-list how reliable this approach is and will report
back privately. It should be fairly reliable.

> Anyway, it is certainly true that it is largely useless to have
> fsync() commit only one path to a file; many applications expect to be
> able to force a simple link(x,y) to be committed to disk.

BSD FFS + softupdates sync all file names, traversing from the mount
point down to the actual directory entries that need to be synched.

> 1) People disagree about what SuS mandates, but at least a few
> critical developers (e.g., sct) say it definitely does not
> require synchronizing directory entries for fsync().
>
> 2) It would be fairly easy and efficient for fsync() to chase one
> chain of directory entries up to the root, but a lot harder and
> slower to find and commit all of them.

For BSD FFS + softupdates, this is already done.

> 3) Most (?) core developers, including Linus (?), would not object
> to "dirsync" as a mount option and/or directory attribute, but
> somebody has to rise to the occasion and create the patches.
>
> Is this an accurate summary?

It looks so to me. Once the MTA behaviour has been dug up, the dirsync
option could be even weaker if fsync() behaved as it does on FFS +
softupdates and also synced the directory entries, including those
created by link and rename.

The only things left to consider would be unlink and symlink. Symlinks
are tough since you cannot open() them. I am not sure about unlink; it
looks as if there is really no way to commit these two apart from
fsync(2)ing the directory or sync(2)ing the world, unless a dirsync
option comes along.
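
The "fsync(2) the directory" fallback can be sketched as below. The helper names sync_dir() and unlink_durably() are made up for this example, and it assumes a system (such as Linux) where a directory can be opened read-only and passed to fsync().

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a directory and fsync() it so that changes
 * to its entries are committed. Returns 0 on success, -1 on failure. */
int sync_dir(const char *dir_path)
{
    int fd = open(dir_path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = fsync(fd);
    if (close(fd) < 0)
        rc = -1;
    return rc < 0 ? -1 : 0;
}

/* Hypothetical helper: remove file_path, then commit the removal by
 * syncing the directory that contained it. The same sync_dir() call
 * could follow symlink(), whose result cannot be open()ed directly. */
int unlink_durably(const char *dir_path, const char *file_path)
{
    if (unlink(file_path) < 0)
        return -1;
    return sync_dir(dir_path);
}
```

For example, unlink_durably("/var/spool/x", "/var/spool/x/msg") would remove the entry and then sync its parent directory; both paths here are placeholders.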

-- 
Matthias Andree


