Re: [PATCH v2 1/2] fs,block: Introduce RWF_ZONE_APPEND and handling in direct IO path

From: Kanchan Joshi
Date: Fri Jun 26 2020 - 17:18:19 EST


On Fri, Jun 26, 2020 at 09:58:46AM +0100, Christoph Hellwig wrote:
> To restate my previous NAK:
>
> A low-level protocol detail like RWF_ZONE_APPEND has absolutely no
> business being exposed in the Linux file system interface.
>
> And as mentioned before I think the idea of returning the actual
> position written for O_APPEND writes totally makes sense, and actually
> is generalizable to all files. Together with zonefs that gives you a
> perfect interface for zone append.
>
> On Thu, Jun 25, 2020 at 10:45:48PM +0530, Kanchan Joshi wrote:
> > Introduce RWF_ZONE_APPEND flag to represent zone-append.
>
> And no one but us select few even know what zone append is, nevermind
> what the detailed semantics are. If you add a userspace API you need
> to very clearly document the semantics including errors and corner cases.

For the block I/O path (which is the scope of this patchset) there is no
problem in using RWF_APPEND for zone-append, because that flag currently
does nothing for a block device. We can use it and avoid introducing
RWF_ZONE_APPEND in user space.

In the kernel, will it be fine to keep IOCB_ZONE_APPEND separate from
IOCB_APPEND? The reason is that this helps isolate the code meant only
for zone-append from the code that is already present for conventional
append.

Snippet for quick reference:

 static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 		ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
 	if (flags & RWF_APPEND)
 		ki->ki_flags |= IOCB_APPEND;
+	if (flags & RWF_ZONE_APPEND) {
+		/* currently support block device only */
+		umode_t mode = file_inode(ki->ki_filp)->i_mode;
+
+		if (!(S_ISBLK(mode)))
+			return -EOPNOTSUPP;
+		ki->ki_flags |= IOCB_ZONE_APPEND;
+	}


As for file I/O in the future, I see a potential problem with RWF_APPEND.
In io_uring, zone-append requires a bit of pre/post processing, which
ideally should be done only in the zone-append case. A ZoneFS file using
RWF_APPEND as a means to invoke zone-append and a regular file (hosted on
some other FS) requiring conventional RWF_APPEND would both execute
that processing.
Is there a good way to differentiate a ZoneFS file from another file that
only wants to use conventional file-append?
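
To make the concern concrete, here is a minimal userspace sketch of the two
gating options. All names (GATE_IOCB_*, gate_on_shared_flag(),
gate_on_split_flag()) are hypothetical and exist only for illustration; the
sketch does not say how a ZoneFS file would come to carry the separate
internal flag, which is exactly the open question.

```c
#include <stdbool.h>

/* Hypothetical internal flag bits for this model; not the kernel's. */
#define GATE_IOCB_APPEND       0x1u
#define GATE_IOCB_ZONE_APPEND  0x2u

/*
 * Option 1: zone append reuses the conventional append flag.
 * The extra pre/post processing cannot tell the two cases apart,
 * so every appending request pays for it.
 */
static bool gate_on_shared_flag(unsigned int ki_flags)
{
	return (ki_flags & GATE_IOCB_APPEND) != 0;
}

/*
 * Option 2: zone append has its own internal flag.
 * Only requests marked as zone append run the extra processing.
 */
static bool gate_on_split_flag(unsigned int ki_flags)
{
	return (ki_flags & GATE_IOCB_ZONE_APPEND) != 0;
}
```

With the shared flag, a conventional append on a regular file
(GATE_IOCB_APPEND) triggers the processing; with the split flag it does not,
and only a request carrying GATE_IOCB_ZONE_APPEND does.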