Re: [PATCH V8 00/33] loop: Issue O_DIRECT aio using bio_vec

From: Ming Lei
Date: Sat Jan 10 2015 - 11:51:13 EST


Hi Guys,

On 1/6/15, Ming Lei <ming.lei@xxxxxxxxxxxxx> wrote:
> On 1/6/15, Maxim Patlasov <mpatlasov@xxxxxxxxxxxxx> wrote:
>> On 12/31/2014 04:52 PM, Ming Lei wrote:
>>> On Thu, Jan 1, 2015 at 6:35 AM, Sedat Dilek <sedat.dilek@xxxxxxxxx>
>>> wrote:
>>>> On Wed, Dec 31, 2014 at 10:52 PM, Dave Kleikamp
>>>> <dave.kleikamp@xxxxxxxxxx> wrote:
>>>>> On 12/31/2014 02:38 PM, Sedat Dilek wrote:
>>>>>> What has happened to that aio_loop patchset?
>>>>>> Is it in Linux-next?
>>>>>> ( /me started to play with "block: loop: convert to blk-mq (v3)",
>>>>>> so I recalled this other improvement. )
>>>>> It met with some harsh resistance, so I backed off on it. Then Al Viro
>>>>> got busy re-writing the iov_iter infrastructure and I put my patchset
>>>>> on the shelf to look at later. Then Ming Lei submitted more up-to-date
>>>>> patchset: https://lkml.org/lkml/2014/8/6/175
>>>>>
>>>>> It looks like Ming is currently only pushing the first half of that
>>>>> patchset. I don't know what his plans are for the last three patches:
>>>>>
>>>>> aio: add aio_kernel_() interface
>>>>> fd/direct-io: introduce should_dirty for kernel aio
>>>>> block: loop: support to submit I/O via kernel aio based
>>>>>
>>>> I tested with block-mq-v3 (for next-20141231) [1] and this looks
>>>> promising [2].
>>>>
>>>> Maybe Ming can say what the plan is with the missing parts.
>>> I have compared kernel aio based loop-mq(the other 3 aio patches
>>> against loop-mq v2, [1]) with loop-mq v3, looks the data isn't
>>> better than loop-mq v3.
>>>
>>> kernel aio based approach requires direct I/O, at least direct write
>>> shouldn't be good as page cache write, IMO.
>>>
>>> So I think we need to investigate kernel aio based approach further
>>> wrt. loop improvement.
>>
>> A great advantage of kernel aio for loop device is the avoidance of
>> double caching: the data from page cache of inner filesystem (mounted on
>> the loop device) won't be cached again in the page cache of the outer
>> filesystem (one that keeps image file of the loop device).
>
> Yes, I agree avoidance of double cache is very good, at least
> page consumption can be decreased, avoid one copy and make the backed
> file more like a 'block' device.
>
>>
>> So I don't think it's correct to compare the performance of aio based
>> loop-mq with loop-mq v3. Aio based approach is OK as long as it doesn't
>> introduce significant overhead as compared with submitting bio-s
>> straightforwardly from loop device (or any other in-kernel user of
>> kernel aio) to host block device.
>
> One problem is that aio based approach requires O_DIRECT, and direct
> write looks much slower compared with page cache write in my fio
> test over loop block directly.
>
> But it might not be so bad when write I/O is considered from
> filesystem over loop block since there is still page cache and the
> write I/O is often big chunk from file system. I will run tests inside
> filesystem to compare the two approaches further.
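
To make the O_DIRECT requirement above concrete, here is a minimal
userspace sketch of what a direct write demands compared with a page
cache write. It is not taken from the loop patches: the helper name
write_one_block() and the 4096-byte alignment (BLK) are assumptions for
illustration only, and the real alignment depends on the backing device
and filesystem.

```c
#define _GNU_SOURCE          /* for O_DIRECT on Linux */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLK 4096             /* assumed alignment, not a universal value */

/*
 * Hypothetical helper: write one aligned block to 'path'.
 * Returns 1 if the write went through O_DIRECT, 0 if it used (or fell
 * back to) buffered page cache I/O, and -1 on error.
 */
static int write_one_block(const char *path, int want_direct)
{
	void *buf = NULL;
	int fd = -1, used_direct = 0;
	ssize_t n;

	/* O_DIRECT needs the buffer, length and offset to be aligned. */
	if (posix_memalign(&buf, BLK, BLK) != 0)
		return -1;
	memset(buf, 0xab, BLK);

	if (want_direct) {
		fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
		if (fd >= 0)
			used_direct = 1;
	}
	/* Not requested, or the filesystem refused O_DIRECT at open time. */
	if (fd < 0)
		fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0) {
		free(buf);
		return -1;
	}

	n = pwrite(fd, buf, BLK, 0);
	if (n < 0 && used_direct && errno == EINVAL) {
		/* Flag accepted but the I/O rejected: retry buffered. */
		close(fd);
		used_direct = 0;
		fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
		if (fd < 0) {
			free(buf);
			return -1;
		}
		n = pwrite(fd, buf, BLK, 0);
	}

	close(fd);
	free(buf);
	return n == BLK ? used_direct : -1;
}
```

Calling write_one_block(path, 1) bypasses the page cache of the
filesystem holding the file when that filesystem supports O_DIRECT, and
write_one_block(path, 0) always goes through the page cache. The
buffered fallback is there only so the sketch still runs on filesystems
that reject direct I/O.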

I have just completed v2 of the kernel aio based loop patches, together
with fio test and sar monitoring results.

If the fio tests are run inside a filesystem (ext4) over the loop block
device, throughput does not change much between kernel aio and no kernel
aio. At the same time, context switches and CPU utilization decrease a lot
with kernel aio, and much less memory is in use after the tests complete
compared with no kernel aio.

If the fio tests are run over the loop block device directly (no fs
mounted), throughput of read, randwrite and write decreases noticeably
with kernel aio. At the same time, CPU utilization and context switches
also decrease with kernel aio in the three tests other than write, and
again much less memory is in use after the tests complete compared with
no kernel aio.

So it looks better to provide a sysfs file to control whether kernel aio
is used.

Before I post the v2 kernel aio based loop patches, I'd like to decide what
the default setting should be: kernel aio or not? It depends on which usage
is more common for loop block users: going through a file system over the
loop block device, or accessing the loop block device directly?

Thanks,
Ming Lei