Re: [PATCH 1/3] nbd: support FLUSH requests

From: Alex Bligh
Date: Wed Feb 13 2013 - 10:55:21 EST


On 13 Feb 2013, at 13:00, Paolo Bonzini wrote:

> But as far as I can test with free servers, the FUA bits have no
> advantage over flush. Also, I wasn't sure if SEND_FUA without
> SEND_FLUSH is valid, and if so how to handle this combination (treat it
> as writethrough and add FUA to all requests? warn and do nothing?).

On the main open-source nbd server, the following applies:

What REQ_FUA does is an fdatasync() after the write. Code extract and
comments below from Christoph Hellwig.

What REQ_FLUSH does is to do an fsync().

The way I read Christoph's comment, provided the linux block layer always
issues a REQ_FLUSH before a REQ_FUA, there is no performance problem.

However, a REQ_FUA is going to do a f(data)?sync AFTER the write, whereas
the preceding REQ_FLUSH is going to do an fsync() BEFORE the write. It seems
to me that either the FUA and FLUSH semantics are therefore different
(and we need FUA), or that Christoph's comment is wrong and that you
are guaranteed a REQ_FLUSH *after* the write with REQ_FUA.
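
To make the ordering concrete, here is a minimal sketch of the two paths as
described above. It is not the actual server code: write_all(),
handle_flush(), handle_write() and flush_then_fua_write() are hypothetical
names used only for illustration. The point is that the fsync() for a flush
only covers writes that have already completed, while the fdatasync() for a
FUA write runs after that write.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all of buf at offset off, handling short writes. */
static int write_all(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n <= 0)
            return -1;
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Flush request: fsync() only covers writes completed BEFORE this call. */
static int handle_flush(int fd)
{
    return fsync(fd);
}

/* Write request: with fua set, fdatasync() runs AFTER the write itself. */
static int handle_write(int fd, const void *buf, size_t len, off_t off, int fua)
{
    if (write_all(fd, buf, len, off) < 0)
        return -1;
    if (fua && fdatasync(fd) < 0)
        return -1;
    return 0;
}

/* The sequence under discussion: a flush, then a write carrying FUA. */
static int flush_then_fua_write(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    int rc;

    if (fd < 0)
        return -1;
    rc = handle_flush(fd);                       /* syncs earlier writes only */
    if (rc == 0)
        rc = handle_write(fd, "data", 4, 0, 1);  /* synced after the write */
    close(fd);
    return rc;
}
```

In this sketch, dropping the fua argument would leave the final write covered
by no sync at all until some later flush arrives, which is the semantic
difference in question.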

Alex Bligh

} else if (fua) {

/* This is where we would do the following
* However, we don't, for the reasons set out below
* by Christoph Hellwig <hch@xxxxxxxxxxxxx>
* fdatasync is equivalent to fsync except that it does not flush
* non-essential metadata (basically just timestamps in practice), but it
* does flush metadata required to find the data again, e.g. allocation
* information and extent maps. sync_file_range does nothing but flush
* out pagecache content - it means you basically won't get your data
* back in case of a crash if you either:
* a) have a volatile write cache in your disk (e.g. any normal SATA disk)
* b) are using a sparse file on a filesystem
* c) are using a fallocate-preallocated file on a filesystem
* d) use any file on a COW filesystem like btrfs
* i.e. it only does anything useful for you if you do not have a volatile
* write cache, and either use a raw block device node, or just overwrite
* an already fully allocated (and not preallocated) file on a non-COW
* filesystem.
* [ENDS]
* What we should do is open a second FD with O_DSYNC set, then write to
* that when appropriate. However, with a Linux client, every REQ_FUA
* immediately follows a REQ_FLUSH, so fdatasync does not cause performance
* problems.
*/
#if 0
sync_file_range(fhandle, foffset, len,
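
For what it is worth, the "second FD with O_DSYNC" idea from the comment
could look roughly like the sketch below. This is an assumption about how
one might do it, not what the server does today: struct export_fds and the
export_*() functions are hypothetical names. FUA writes go through a
descriptor opened with O_DSYNC, so each such write is synced on its own and
no separate fdatasync() call is needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-export state: two descriptors on the same backing file. */
struct export_fds {
    int fd;        /* ordinary descriptor, used for non-FUA writes */
    int fd_dsync;  /* opened with O_DSYNC, used for FUA writes */
};

static int export_open(struct export_fds *e, const char *path)
{
    e->fd = open(path, O_RDWR | O_CREAT, 0600);
    if (e->fd < 0)
        return -1;
    e->fd_dsync = open(path, O_RDWR | O_DSYNC);
    if (e->fd_dsync < 0) {
        close(e->fd);
        return -1;
    }
    return 0;
}

/* Pick the descriptor by request type; O_DSYNC makes each FUA write
 * behave as if it were followed by fdatasync(). */
static int export_write(struct export_fds *e, const void *buf, size_t len,
                        off_t off, int fua)
{
    int fd = fua ? e->fd_dsync : e->fd;
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n <= 0)
            return -1;
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

static void export_close(struct export_fds *e)
{
    close(e->fd_dsync);
    close(e->fd);
}
```

A flush request would still need an fsync() on the ordinary descriptor, since
non-FUA writes are not synced by this scheme.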
