Crash safety semantics of mmap(), rename(), fsync(), msync()

From: Alex Bligh
Date: Sat Jul 23 2011 - 07:06:36 EST


I have a question about the crash safety of mmap(), rename() and fsync()/msync().

I am creating a memory mapped file with a particular name, and I want to ensure the following in the event of a system crash / power outage / whatever:
A. If the file exists on disk with its final name at any stage, it MUST contain all the data I have written to it (i.e. it must never exist on disk with different data and this name)
B. At the exit from the function concerned, the file must exist with its final name, and must contain that data on disk. The fd must still be open and the mmap in place (it's used to read from elsewhere).

What I am doing to achieve this is:
1. open (O_CREAT) to a temporary file name
2. ftruncate() the file to required length
3. mmap() the file
4. write the data to the mmap'd area
5. msync() the whole area to ensure the data is written
6. fsync() the file to ensure the metadata is written (e.g. the creation of the file in the first place and the extension of the file by ftruncate())
7. rename() the file to the required file name
8. fsync() the file again, to ensure the rename is written to disk (to satisfy criterion B above)
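
For concreteness, the eight steps above can be sketched in C roughly as follows. This is only an illustrative sketch of the sequence as described, not a claim that it is sufficient for criteria A and B (that is exactly what is being asked). The function name create_mapped_file(), its parameters and the 0644 mode are hypothetical, chosen for illustration.

```c
#include <fcntl.h>
#include <stdio.h>      /* rename() */
#include <string.h>     /* memcpy() */
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper following steps 1-8 literally.
 * Returns 0 on success, leaving *fd_out open and *map_out mapped
 * (criterion B); returns -1 on failure. len must be greater than 0,
 * since mmap() rejects a zero-length mapping.
 */
int create_mapped_file(const char *tmp_name, const char *final_name,
                       const void *data, size_t len,
                       int *fd_out, void **map_out)
{
    void *map = MAP_FAILED;

    /* Step 1: create the file under a temporary name. */
    int fd = open(tmp_name, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Step 2: extend the file to the required length. */
    if (ftruncate(fd, (off_t)len) != 0)
        goto fail;

    /* Step 3: map it shared, so stores reach the file. */
    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto fail;

    /* Step 4: write the data through the mapping. */
    memcpy(map, data, len);

    /* Step 5: flush the mapped pages synchronously. */
    if (msync(map, len, MS_SYNC) != 0)
        goto fail;

    /* Step 6: first fsync(), intended to cover creation and length. */
    if (fsync(fd) != 0)
        goto fail;

    /* Step 7: give the file its final name. */
    if (rename(tmp_name, final_name) != 0)
        goto fail;

    /* Step 8: second fsync(), intended to cover the rename. */
    if (fsync(fd) != 0)
        goto fail;

    *fd_out = fd;
    *map_out = map;
    return 0;

fail:
    /* Best-effort cleanup; if the rename already happened, the
     * unlink of tmp_name simply fails and final_name is left alone. */
    if (map != MAP_FAILED)
        munmap(map, len);
    close(fd);
    unlink(tmp_name);
    return -1;
}
```

A caller would pass the temporary and final path names plus the buffer to store, and on success keep using the returned fd and mapping for reads.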

This is all quite time critical. I can pretty much choose the file system but would prefer something like ext4. The file is between 1MB and 8MB in size, if that matters.

The question I have is this: Is it really necessary to msync() and then fsync() twice? Can I get away without (e.g.) the stage 6 fsync()? Or, without it, might a crash immediately after the rename() result in a file that has the permanent name, but the wrong metadata? (I only care about file name and recorded length, I think.) Or is rename() guaranteed to write out all metadata (e.g. file length), in which case can I drop both fsync()s? Or are the fsync()s guaranteed to be cheap if they have nothing to do?

--
Alex Bligh