Re: [POC RFC 0/3] splice(2) support for io_uring

From: Jens Axboe
Date: Tue Jan 21 2020 - 22:30:24 EST


On 1/21/20 8:11 PM, Pavel Begunkov wrote:
> On 22/01/2020 04:55, Jens Axboe wrote:
>> On 1/21/20 5:05 PM, Pavel Begunkov wrote:
>>> It works well for basic cases, but there is still work to be done. E.g.
>>> it misses @hash_reg_file checks for the second (output) file. Anyway,
>>> there are some questions I want to discuss:
>>>
>>> - why sqe->len is __u32? Splice uses size_t, and I think it's better
>>> to have something wider (e.g. u64) for fututre use. That's the story
>>> behind added sqe->splice_len.
>>
>> IO operations in Linux generally are INT_MAX, so the u32 is plenty big.
>> That's why I chose it. For this specifically, if you look at splice:
>>
>> if (unlikely(len > MAX_RW_COUNT))
>> len = MAX_RW_COUNT;
>>
>> so anything larger is truncated anyway.
>
> Yeah, I saw this one, but that was rather an argument for the future.
> It's pretty easy to transfer more than 4GB with sg list, but that
> would be the case for splice.

I don't see this changing, ever, basically. And probably not a big deal,
if you want to do more than 2GB worth of IO, you simply splice them over
multiple commands. At those sizes, the overhead there is negligible.

>>> - it requires 2 fds, and it's painful. Currently file managing is done
>>> by common path (e.g. io_req_set_file(), __io_req_aux_free()). I'm
>>> thinking to make each opcode function handle file grabbing/putting
>>> themself with some helpers, as it's done in the patch for splice's
>>> out-file.
>>> 1. Opcode handler knows, whether it have/needs a file, and thus
>>> doesn't need extra checks done in common path.
>>> 2. It will be more consistent with splice.
>>> Objections? Ideas?
>>
>> Sounds reasonable to me, but always easier to judge in patch form :-)
>>
>>> - do we need offset pointers with fallback to file->f_pos? Or is it
>>> enough to have offset value. Jens, I remember you added the first
>>> option somewhere, could you tell the reasoning?
>>
>> I recently added support for -1/cur position, which splice also uses. So
>> you should be fine with that.
>>
>
> I always have been thinking about it as a legacy from old days, and
> one of the problems of posix. It's not hard to count it in the
> userspace especially in C++ or high-level languages, and is just
> another obstacle for having a performant API. So, I'd rather get rid
> of it here. But is there any reasons against?

It's not always trivial to do in libraries, or programming languages
even. That's why it exists. I would not expect anyone to use it outside
of that.

--
Jens Axboe