Re: ublk-nbd: ublk-nbd is available

From: Ming Lei
Date: Wed Jan 25 2023 - 22:09:41 EST


Hi Jens,

On Thu, Jan 19, 2023 at 11:49:04AM -0700, Jens Axboe wrote:
> On 1/19/23 7:23 AM, Ming Lei wrote:
> > Hi,
> >
> > ublk-nbd[1] is available now.
> >
> > Basically it is an nbd client implemented entirely in userspace,
> > whereas with the current nbd-client in [2], the transmission phase
> > is done by the linux block nbd driver.
> >
> > The handshake implementation is borrowed from the nbd project[2], so
> > ublk-nbd basically just adds new code implementing the transmission
> > phase, and it can be thought of as moving the linux block nbd driver
> > into userspace.
> >
> > The new code is mostly in nbd/tgt_nbd.cpp. IO handling is based on
> > liburing[3] and implemented with c++20 coroutines, so everything is
> > done in a single pthread, completely lockless. It also turned out to
> > be pretty easy to design & implement, thanks to the ublk framework,
> > c++20 coroutines and liburing.
> >
> > ublk-nbd supports both tcp and unix sockets, and io_uring send zero
> > copy can be enabled via the command line option '--send_zc'; see
> > details in README[4].
> >
> > No regression is found in xfstests when using ublk-nbd as both test
> > device and scratch device, and the builtin test (make test T=nbd)
> > runs well.
> >
> > The fio test ("make test T=nbd") shows that ublk-nbd performance is
> > basically the same as nbd-client/nbd driver when running fio over a
> > real ethernet link (1g, 10+g), but ublk-nbd IOPS is ~40% higher than
> > nbd-client (nbd driver) with 512K BS, because the linux nbd driver
> > sets max_sectors_kb to 64KB by default.
> >
> > But when running fio over a local tcp socket, I observe on my test
> > machine that ublk-nbd performs better than nbd-client/nbd driver,
> > especially with 2 queues/2 jobs, and the gap can be 10% ~ 30%
> > depending on block size.
>
> This is pretty nice! Just curious, have you tried setting up your
> ring with
>
> p.flags |= IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;
>
> and see if that yields any extra performance improvements for you?
> Depending on how you do processing, you should not need to do any
> further changes there.
>
> A "lighter" version is just setting IORING_SETUP_COOP_TASKRUN.

IORING_SETUP_COOP_TASKRUN is enabled in current ublksrv.

After disabling COOP_TASKRUN and enabling SINGLE_ISSUER & DEFER_TASKRUN,
I don't see any obvious improvement; meanwhile, a regression is observed
on 64k rw.
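
For reference, the two ring configurations compared above can be sketched
roughly as follows. This is only an illustration, not the actual ublksrv
code: the enum and the helper name ring_setup_flags() are hypothetical,
and the flag values are copied from include/uapi/linux/io_uring.h so the
snippet builds without recent kernel headers.

```c
#include <assert.h>

/* Values mirror include/uapi/linux/io_uring.h; defined here only as a
 * fallback so the sketch compiles without recent kernel headers. */
#ifndef IORING_SETUP_COOP_TASKRUN
#define IORING_SETUP_COOP_TASKRUN   (1U << 8)
#endif
#ifndef IORING_SETUP_SINGLE_ISSUER
#define IORING_SETUP_SINGLE_ISSUER  (1U << 12)
#endif
#ifndef IORING_SETUP_DEFER_TASKRUN
#define IORING_SETUP_DEFER_TASKRUN  (1U << 13)
#endif

/* Hypothetical names for the two setups discussed in this thread. */
enum taskrun_mode {
	TASKRUN_COOP,	/* the "lighter" option: COOP_TASKRUN only */
	TASKRUN_DEFER,	/* SINGLE_ISSUER | DEFER_TASKRUN */
};

/* Build the value to OR into io_uring_params.flags for a given mode. */
static unsigned int ring_setup_flags(enum taskrun_mode mode)
{
	unsigned int flags = 0;

	switch (mode) {
	case TASKRUN_COOP:
		flags |= IORING_SETUP_COOP_TASKRUN;
		break;
	case TASKRUN_DEFER:
		flags |= IORING_SETUP_SINGLE_ISSUER |
			 IORING_SETUP_DEFER_TASKRUN;
		break;
	}

	/* The kernel rejects DEFER_TASKRUN unless SINGLE_ISSUER is set. */
	assert(!(flags & IORING_SETUP_DEFER_TASKRUN) ||
	       (flags & IORING_SETUP_SINGLE_ISSUER));
	return flags;
}
```

The returned value would be ORed into p.flags of a struct io_uring_params
before calling io_uring_queue_init_params(). Note that with DEFER_TASKRUN,
completions are only processed when the submitting task enters the kernel
to wait for events, so whether further changes are needed depends on how
the io loop reaps CQEs.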


Thanks,
Ming