Re: [PATCH V2 1/1] ubd: add io_uring based userspace block driver

From: Ziyang Zhang
Date: Wed May 18 2022 - 01:53:25 EST


On 2022/5/17 20:55, Ming Lei wrote:
> On Tue, May 17, 2022 at 06:00:57PM +0800, Ziyang Zhang wrote:
>> On 2022/5/17 13:53, Ming Lei wrote:
>>
>>> +
>>> +static void ubd_cancel_queue(struct ubd_queue *ubq)
>>> +{
>>> +	int i;
>>> +
>>> +	for (i = 0; i < ubq->q_depth; i++) {
>>> +		struct ubd_io *io = &ubq->ios[i];
>>> +
>>> +		if (io->flags & UBD_IO_FLAG_ACTIVE) {
>>> +			io->flags &= ~UBD_IO_FLAG_ACTIVE;
>>> +			io_uring_cmd_done(io->cmd, UBD_IO_RES_ABORT, 0);
>>> +		}
>>> +	}
>>> +}
>>
>> Hi Ming,
>>
>> When ubdsrv sends STOP_DEV and all active IOs in ubd_drv are done (UBD_IO_RES_ABORT),
>> there may still be some IOs being handled by ubdsrv (UBD_IO_FLAG_ACTIVE not set).
>> When these IOs complete and return to ubd_drv, how are they handled?
>
> Either UBD_IO_COMMIT_AND_FETCH_REQ or UBD_IO_COMMIT_REQ will be sent to ubd_drv
> for completing these IOs. And finally ubd_cancel_dev() in ubd driver will
> cancel all pending io commands, so io_uring can be exited. I guess
> UBD_IO_COMMIT_REQ can be removed too.

Yes, I think UBD_IO_COMMIT_REQ can be removed.

>
>> I find that UBD_IO_FETCH_REQ is still set,
>> so will these IOs be issued to ubdsrv again or canceled?
>> (I see that ubd_drv fails IOs when the daemon is dying,
>> but maybe here the daemon is still alive.)
>
> If daemon is alive, ubd_drv will rely on ubq_daemon for completing
> all inflight IOs. Otherwise, the monitor work will be triggered for
> completing/failing inflight IOs. The mechanism is actually very simple:
>
> static void ubd_stop_dev(struct ubd_device *ub)
> {
> 	mutex_lock(&ub->mutex);
> 	if (!disk_live(ub->ub_disk))
> 		goto unlock;
>
> 	del_gendisk(ub->ub_disk);	// drain & wait in-flight IOs
> 	ub->dev_info.state = UBD_S_DEV_DEAD;
> 	ub->dev_info.ubdsrv_pid = -1;
> 	ubd_cancel_dev(ub);	// No IO is possible now, so cancel pending io commands
> unlock:
> 	mutex_unlock(&ub->mutex);
> 	cancel_delayed_work_sync(&ub->monitor_work);
> }
>
> While waiting for IO completion in del_gendisk(), if ubq_daemon is
> exiting/dying, the monitor work will be triggered to call ubd_abort_queue()
> and fail in-flight requests so that forward progress is made.
> ubd_abort_queue() may look a bit tricky because it tries to use task work
> to abort requests, but that is just to sync with ubd_rq_task_work_fn().
>
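
To make sure I read the cancel path correctly, here is a minimal userspace
sketch of the loop in ubd_cancel_queue() quoted earlier. It is only a model
under my own assumptions: mock_io, mock_queue, mock_cmd_done() and the MOCK_*
constants are hypothetical stand-ins (mock_cmd_done() replaces
io_uring_cmd_done(), and the abort value is an arbitrary placeholder), not the
driver's real definitions. It shows that only slots with the ACTIVE flag set
are completed with the abort result, while slots currently handled by ubdsrv
(flag not set) are left alone until they are committed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the driver's real values. */
#define MOCK_IO_FLAG_ACTIVE 0x01u
#define MOCK_IO_RES_ABORT   (-5)	/* arbitrary placeholder result code */
#define MOCK_MAX_DEPTH      8

struct mock_io {
	unsigned int flags;
	int completed;	/* 1 once the pending io command has been completed */
	int res;	/* result passed to the completion */
};

struct mock_queue {
	int q_depth;
	struct mock_io ios[MOCK_MAX_DEPTH];
};

/* Assumed stand-in for io_uring_cmd_done(): just records the completion. */
static void mock_cmd_done(struct mock_io *io, int res)
{
	io->completed = 1;
	io->res = res;
}

/*
 * Model of the quoted cancel loop: complete every slot whose io command is
 * still pending in the driver (ACTIVE set) and skip slots that the server
 * is handling. Returns the number of cancelled slots.
 */
static int mock_cancel_queue(struct mock_queue *q)
{
	int i, cancelled = 0;

	assert(q->q_depth <= MOCK_MAX_DEPTH);

	for (i = 0; i < q->q_depth; i++) {
		struct mock_io *io = &q->ios[i];

		if (io->flags & MOCK_IO_FLAG_ACTIVE) {
			io->flags &= ~MOCK_IO_FLAG_ACTIVE;
			mock_cmd_done(io, MOCK_IO_RES_ABORT);
			cancelled++;
		}
	}
	return cancelled;
}
```

For example, with a depth of 3 and slots 0 and 2 active, mock_cancel_queue()
should return 2 and leave slot 1 untouched; a second call should find nothing
left to cancel.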

Thanks for the explanation; this part really confused me. :)
But I am still concerned about the complexity of handling an exiting/dying
ubq_daemon and aborting queues, and I am trying to find a simpler way...

Another question: using task_work functions requires UBD to be built into the kernel.
However, some users may want to use UBD as an external module.
Shall we discuss this now?

Regards,
Zhang