Re: [PATCH v3 2/2] io_uring: add support for IORING_OP_GETDENTS

From: Pavel Begunkov
Date: Fri Feb 19 2021 - 07:14:51 EST


On 19/02/2021 12:05, Pavel Begunkov wrote:
> On 18/02/2021 12:27, Lennert Buytenhek wrote:
>> IORING_OP_GETDENTS behaves much like getdents64(2) and takes the same
>> arguments, but with a small twist: it takes an additional offset
>> argument, and reading from the specified directory starts at the given
>> offset.
>>
>> For the first IORING_OP_GETDENTS call on a directory, the offset
>> parameter can be set to zero, and for subsequent calls, it can be
>> set to the ->d_off field of the last struct linux_dirent64 returned
>> by the previous IORING_OP_GETDENTS call.
>>
>> Internally, if necessary, IORING_OP_GETDENTS will vfs_llseek() to
>> the right directory position before calling vfs_getdents().
>>
>> IORING_OP_GETDENTS may or may not update the specified directory's
>> file offset, and the file offset should not be relied upon having
>> any particular value during or after an IORING_OP_GETDENTS call.
>>
>> Signed-off-by: Lennert Buytenhek <buytenh@xxxxxxxxxxxxxx>
>> ---
>> fs/io_uring.c | 73 +++++++++++++++++++++++++++++++++++
>> include/uapi/linux/io_uring.h | 1 +
>> 2 files changed, 74 insertions(+)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 056bd4c90ade..6853bf48369a 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -635,6 +635,13 @@ struct io_mkdir {
>> struct filename *filename;
>> };
>>
> [...]
>> +static int io_getdents(struct io_kiocb *req, unsigned int issue_flags)
>> +{
>> + struct io_getdents *getdents = &req->getdents;
>> + bool pos_unlock = false;
>> + int ret = 0;
>> +
>> + /* getdents always requires a blocking context */
>> + if (issue_flags & IO_URING_F_NONBLOCK)
>> + return -EAGAIN;
>> +
>> + /* for vfs_llseek and to serialize ->iterate_shared() on this file */
>> + if (file_count(req->file) > 1) {
>
> Looks racy, is it safe? E.g. can be concurrently dupped and used, or just
> several similar IORING_OP_GETDENTS requests.
>
>> + pos_unlock = true;
>> + mutex_lock(&req->file->f_pos_lock);
>> + }
>> +
>> + if (req->file->f_pos != getdents->pos) {
>> + loff_t res = vfs_llseek(req->file, getdents->pos, SEEK_SET);
>
> I may be missing the previous discussions, but can this ever become
> stateless, like passing an offset? Including readdir.c and beyond.

I mean without those seeks. An emulation would look like rewinding
pos back after vfs_getdents, though might be awful on performance.

>
>> + if (res < 0)
>> + ret = res;
>> + }
>> +
>> + if (ret == 0) {
>> + ret = vfs_getdents(req->file, getdents->dirent,
>> + getdents->count);
>> + }
>> +
>> + if (pos_unlock)
>> + mutex_unlock(&req->file->f_pos_lock);
>> +
>> + if (ret < 0) {
>> + if (ret == -ERESTARTSYS)
>> + ret = -EINTR;
>> + req_set_fail_links(req);
>> + }
>> + io_req_complete(req, ret);
>> + return 0;
>> +}
> [...]
>

--
Pavel Begunkov