On Mon, Sep 30, 2013 at 02:20:30PM +0200, Miklos Szeredi wrote:
On Sat, Sep 28, 2013 at 11:20 PM, Ric Wheeler <rwheeler@xxxxxxxxxx> wrote:
If I were writing an application that required copies to be restartable,
Okay, I'm convinced.
I don't see the safety argument very compelling either. There are real
semantic differences, however: ENOSPC on a write to an
(apparently) already allocated block. That could be a bit unexpected.
Do we need a fallocate extension to deal with shared blocks?
The above has been the case for all enterprise storage arrays ever since
the invention of snapshots. The NFSv4.2 spec does allow you to set a
per-file attribute that causes the storage server to always preallocate
enough buffers to guarantee that you can rewrite the entire file, however
the fact that we've lived without it for said 20 years leads me to believe
that demand for it is going to be limited. I haven't put it top of the list
of features we care to implement...
Cheers,
Trond
I agree - this has been common behaviour for a very long time in the array
space. Even without an array, this is the same as overwriting a block in
btrfs or any file system with a read-write LVM snapshot.
So I suggest
 - mount(..., MNT_REFLINK): *allow* splice to reflink. If this is not
   set, fall back to page cache copy.
 - splice(... SPLICE_REFLINK): fail non-reflink copy. With this, an app
   can force reflink.
Both are trivial to implement and make sure that no backward
incompatibility surprises happen.
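
One possible reading of the two flags proposed above, written out as a small decision function. This is only a sketch: MNT_REFLINK and SPLICE_REFLINK are the names from the proposal, not existing kernel flags, and the enum, the function name and the `can_share` input (whether the filesystem could share blocks for this particular range) are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Hypothetical outcome of a splice-based copy under the proposal. */
enum copy_mode {
    COPY_REFLINK,    /* share blocks, no data copied */
    COPY_PAGECACHE,  /* ordinary copy through the page cache */
    COPY_FAIL        /* caller insisted on reflink, not possible */
};

/*
 * mnt_reflink:    mount was done with the proposed MNT_REFLINK
 * splice_reflink: caller passed the proposed SPLICE_REFLINK
 * can_share:      assumption: fs is able to reflink this range
 */
static enum copy_mode choose_copy_mode(bool mnt_reflink,
                                       bool splice_reflink,
                                       bool can_share)
{
    /* Reflink is only *allowed* when the mount opted in. */
    if (mnt_reflink && can_share)
        return COPY_REFLINK;

    /* Caller asked to fail rather than do a non-reflink copy. */
    if (splice_reflink)
        return COPY_FAIL;

    /* Default: behave exactly as before, so nothing old breaks. */
    return COPY_PAGECACHE;
}
```

With neither flag set the result is always the page cache copy, which is the backward-compatibility property the proposal is after.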
My other worry is about interruptibility/restartability. Ideas?
What happens on splice(from, to, 4G) and it's a non-reflink copy?
Can the page cache copy be made restartable? Or should splice() be
allowed to return a short count? What happens on (non-reflink) remote
copies and huge request sizes?
I'd probably use the largest possible range in the reflink case but
break the copy into smaller chunks in the splice case.
For that reason I don't like the idea of a mount option--the choice is
something that the application probably wants to make (or at least to
know about).
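
To make the chunking idea above concrete, here is a minimal userspace sketch of a restartable non-reflink copy. It uses plain pread()/pwrite() as a stand-in for the splice path, and the helper name `copy_chunked` and its parameters are hypothetical. The point is only that the caller keeps the offset, so an interrupted copy can be resumed from where it stopped.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pread()/pwrite() declarations */
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy bytes [*off, end) from in_fd to out_fd at the
 * same offsets, at most `chunk` bytes per step. *off is advanced past every
 * byte that was written, so after an error the caller can call again with
 * the same *off to resume. Returns 0 when done (or at EOF), -1 on error
 * with errno set.
 */
static int copy_chunked(int in_fd, int out_fd, off_t *off, off_t end,
                        size_t chunk)
{
    char buf[64 * 1024];

    if (chunk == 0 || chunk > sizeof(buf))
        chunk = sizeof(buf);

    while (*off < end) {
        size_t want = chunk;
        if ((off_t)want > end - *off)
            want = (size_t)(end - *off);

        ssize_t n = pread(in_fd, buf, want, *off);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* retry the same chunk */
            return -1;
        }
        if (n == 0)
            break;                  /* source is shorter than `end` */

        ssize_t done = 0;
        while (done < n) {          /* handle short writes */
            ssize_t w = pwrite(out_fd, buf + done, (size_t)(n - done),
                               *off + done);
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                *off += done;       /* record progress before failing */
                return -1;
            }
            done += w;
        }
        *off += n;
    }
    return 0;
}
```

An application would pick a small `chunk` here and, in the reflink case, simply issue one request for the whole range instead.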
The NFS COPY operation, as specified in current drafts, allows for
asynchronous copies but leaves the state of the file undefined in the
case of an aborted COPY. I worry that agreeing on standard behavior in
the case of an abort might be difficult.
--b.