[PATCH v4 0/6] zone-append support in io-uring and aio

From: Kanchan Joshi
Date: Fri Jul 24 2020 - 12:22:41 EST


Changes since v3:
- Return absolute append offset in bytes, in both io_uring and aio
- Repurpose cqe's res/flags and introduce res64 to send 64bit append-offset
- Change iov_iter_truncate to report whether it actually truncated
- Prevent short write and return failure if zone-append is spanning
  beyond end-of-device
- Change ki_complete(...,long ret2) interface to support 64bit ret2

v3: https://lore.kernel.org/lkml/1593974870-18919-1-git-send-email-joshi.k@xxxxxxxxxxx/

Changes since v2:
- Use file append infra (O_APPEND/RWF_APPEND) to trigger zone-append
  (Christoph, Wilcox)
- Added Block I/O path changes (Damien). Avoided append split into multi-bio.
- Added patch to extend zone-append in block-layer to support bvec iov_iter.
  Append using io-uring fixed-buffer is enabled with this.
- Made io-uring support code more concise, added changes mentioned by Pavel.

v2: https://lore.kernel.org/io-uring/1593105349-19270-1-git-send-email-joshi.k@xxxxxxxxxxx/

Changes since v1:
- No new opcodes in uring or aio. Use RWF_ZONE_APPEND flag instead.
- linux-aio changes vanish because of no new opcode
- Fixed the overflow and other issues mentioned by Damien
- Simplified uring support code, fixed the issues mentioned by Pavel
- Added error checks for io-uring fixed-buffer and sync kiocb

v1: https://lore.kernel.org/io-uring/1592414619-5646-1-git-send-email-joshi.k@xxxxxxxxxxx/

Cover letter (updated):

This patchset enables zone-append through io-uring/linux-aio, on the block IO
path. The purpose is to let applications that use a zoned-block-device directly
consume zone-append.
An application can send a write with the existing O_APPEND/RWF_APPEND; on a
zoned-block-device this triggers zone-append. On a regular block device, the
existing file-append behavior is retained. However, the infra allows zone-append
to be triggered on any file if FMODE_ZONE_APPEND (new kernel-only fmode) is set
during open.

With zone-append, the written location within the zone is known only after
completion. So apart from the usual return value of write, an additional means
is needed to obtain the actual written location.

In aio, the 64bit append-offset is returned to the application in the res2
field of io_event:

struct io_event {
	__u64 data;	/* the data field from the iocb */
	__u64 obj;	/* what iocb this event came from */
	__s64 res;	/* result code for this event */
	__s64 res2;	/* secondary result */
};
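
A minimal sketch of how an application might pick the append offset out of
an aio completion. It assumes the usual aio convention that a negative res
carries an errno, and that res2 is only meaningful when res is non-negative.
The struct append_io_event mirror and the append_offset_from_event() helper
are hypothetical names used for illustration; they are not part of this
series.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the io_event layout quoted above (illustration only). */
struct append_io_event {
	uint64_t data;	/* the data field from the iocb */
	uint64_t obj;	/* what iocb this event came from */
	int64_t  res;	/* result code for this event */
	int64_t  res2;	/* secondary result: append offset in bytes */
};

static_assert(sizeof(struct append_io_event) == 32, "four 64-bit fields");

/*
 * Hypothetical helper: on success return 0 and store the absolute append
 * offset in *offset; on failure return the negative errno found in res
 * and leave *offset untouched.
 */
static int append_offset_from_event(const struct append_io_event *ev,
				    uint64_t *offset)
{
	if (ev->res < 0)
		return (int)ev->res;	/* assumption: negative res is an errno */

	*offset = (uint64_t)ev->res2;
	return 0;
}
```

The helper reads res first so that a failed append is never mistaken for a
write that landed at offset 0.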

In io-uring, [cqe->res, cqe->flags] are repurposed into res64 to return the
64bit append-offset to user-space.

struct io_uring_cqe {
	__u64 user_data;	/* sqe->data submission passed back */
	union {
		struct {
			__s32 res;	/* result code for this event */
			__u32 flags;
		};
		__s64 res64;	/* appending offset for zone append */
	};
};

Zone-append write is guaranteed not to be a short write.
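
The io_uring_cqe layout above can be checked with a small stand-alone
sketch. It uses a local mirror of the proposed structure (struct append_cqe,
a hypothetical name) rather than any installed uapi header, since res64 is
only proposed in this series. The two helpers are likewise illustrative.
The sketch shows that overlaying res64 on [res, flags] keeps the CQE at
16 bytes and lets a byte offset beyond 4GiB round-trip intact.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the proposed CQE layout; not the installed uapi header. */
struct append_cqe {
	uint64_t user_data;	/* sqe->data submission passed back */
	union {
		struct {
			int32_t  res;	/* result code for this event */
			uint32_t flags;
		};
		int64_t res64;	/* appending offset for zone append */
	};
};

/* The union must not grow the CQE beyond its existing 16 bytes. */
static_assert(sizeof(struct append_cqe) == 16, "CQE size must stay 16 bytes");

/* Hypothetical helper: store a byte offset as a completion would. */
static void cqe_set_append_offset(struct append_cqe *cqe, int64_t offset)
{
	cqe->res64 = offset;
}

/* Hypothetical helper: read the 64-bit append offset back. */
static int64_t cqe_append_offset(const struct append_cqe *cqe)
{
	return cqe->res64;
}
```

An offset such as 0x123456789 does not fit in the 32-bit res alone, which
is why both res and flags are overlaid by the single 64-bit field.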

Kanchan Joshi (3):
  fs: introduce FMODE_ZONE_APPEND and IOCB_ZONE_APPEND
  block: add zone append handling for direct I/O path
  block: enable zone-append for iov_iter of bvec type

SelvaKumar S (3):
  fs: change ki_complete interface to support 64bit ret2
  uio: return status with iov truncation
  io_uring: add support for zone-append

 block/bio.c                       | 31 ++++++++++++++++++++---
 drivers/block/loop.c              |  2 +-
 drivers/nvme/target/io-cmd-file.c |  2 +-
 drivers/target/target_core_file.c |  2 +-
 fs/aio.c                          |  2 +-
 fs/block_dev.c                    | 51 +++++++++++++++++++++++++++++--------
 fs/io_uring.c                     | 53 +++++++++++++++++++++++++++++++--------
 fs/overlayfs/file.c               |  2 +-
 include/linux/fs.h                | 16 +++++++++---
 include/linux/uio.h               |  7 ++++--
 include/uapi/linux/io_uring.h     |  9 +++++--
 11 files changed, 142 insertions(+), 35 deletions(-)

--
2.7.4