Re: [PATCH v3 4/4] block: loop: support DIO & AIO

From: Ming Lei
Date: Fri May 08 2015 - 00:09:12 EST


Hi Dave,

Thanks for your comment!

On Fri, May 8, 2015 at 6:20 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Thu, May 07, 2015 at 08:32:39PM +0800, Ming Lei wrote:
>> On Thu, May 7, 2015 at 3:24 PM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>> >> @@ -441,6 +500,12 @@ static void do_loop_switch(struct loop_device *lo, struct switch_request *p)
>> >> mapping->host->i_bdev->bd_block_size : PAGE_SIZE;
>> >> lo->old_gfp_mask = mapping_gfp_mask(mapping);
>> >> mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS));
>> >> +
>> >> + lo->support_dio = mapping->a_ops && mapping->a_ops->direct_IO;
>> >> + if (lo->support_dio)
>> >> + lo->use_aio = true;
>> >> + else
>> >> + lo->use_aio = false;
>> >
>> > We need an explicit userspace op-in for this. For one direct I/O can't
>>
>> Actually this patch is one simplified version, and my old version
>> has exported two sysfs files(use_aio, use_dio) which can control
>> if direct IO or AIO is used but only AIO is enabled if DIO is set. Finally
>> I think it isn't necessary because dio/aio works well from the tests,
>> and userspace shouldn't care if it is AIO or not if the performance
>> is good.
>
> Performance won't always be good.
>
> It looks to me that this has an unbound queue depth for AIO. What
> throttles the amount of IO userspace can throw at an aio-enabled
> loop device? If it's unbound, then userspace can throw gigabytes of
> random write at the loop device and rather thanbe throttled at 128
> outstanding IOs, the queue will just keep growing. That will have
> adverse affects on dirty memory throttling, memory reclaim
> algorithms, read and write latency, etc.
>
> I suspect that if we are going to make the loop device use AIO, it
> will needs a proper queue depth limit (i.e.
> /sys/block/loop0/queue/nr_requests) enforced to avoid this sort of
> problem...

Loop has been converted to blk-mq, and the current queue depth is
128, so there isn't the problem you worried about, is there?

>> > handle sub-sector size access and people use the loop device as a
>> > workaround for that.
>>
>> Yes, user can do that, could you explain a bit what the problem is?
>
> I have a 4k sector backing device and a 512 byte sector filesystem
> image. I can't do 512 byte direct IO to the filesystem image, so I
> can't run tools that handle fs images in files using direct Io on
> that file. Create a loop device with the filesystem image, and now I
> can do 512 byte direct IO to the filesystem image, because all that
> direct IO to the filesystem image is now buffered by the loop
> device.
>
> If the loop device does direct io in this situation, the backing
> filesystem rejects direct IO from the loop device because it is not
> sector (4k) sized/aligned. User now swears, shouts and curses you
> from afar.

Yes, it is one problem, but looks it can be addressed by adding the
following in loop_set_fd():

if (inode->i_sb->s_bdev)
blk_queue_logical_block_size(lo->lo_queue,
bdev_io_min(inode->i_sb->s_bdev));

> DIO and AIO behaviour needs to be configurable through losetup, and
> most definitely not the default behaviour.

Could you share if there are other reasons for making it configurable via
losetup suppose the above issues can be fixed?

Thanks,
Ming Lei

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/