Re: [PATCH] nbd: add missed destroy_workqueue when nbd_start_device fails

From: Michael Christie
Date: Sat Jul 25 2020 - 14:49:07 EST




> On Jul 24, 2020, at 9:27 PM, Li Heng <liheng40@xxxxxxxxxx> wrote:
>
> destroy_workqueue() should be called to destroy ndev->tx_wq
> when nbd_start_device init resources fails.
>
> Reported-by: Hulk Robot <hulkci@xxxxxxxxxx>
> Signed-off-by: Li Heng <liheng40@xxxxxxxxxx>
> ---
> drivers/block/nbd.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index ce7e9f22..45e0a9f4 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -4,7 +4,7 @@
> *
> * Note that you can not swap over this thing, yet. Seems to work but
> * deadlocks sometimes - you can not swap over TCP in general.
> - *
> + *
> * Copyright 1997-2000, 2008 Pavel Machek <pavel@xxxxxx>
> * Parts copyright 2001 Steven Whitehouse <steve@xxxxxxxxxxx>
> *
> @@ -1270,6 +1270,7 @@ static int nbd_start_device(struct nbd_device *nbd)
> error = device_create_file(disk_to_dev(nbd->disk), &pid_attr);
> if (error) {
> dev_err(disk_to_dev(nbd->disk), "device_create_file failed!\n");
> + destroy_workqueue(nbd->recv_workq);
> return error;
> }
> set_bit(NBD_RT_HAS_PID_FILE, &config->runtime_flags);
> @@ -1291,6 +1292,7 @@ static int nbd_start_device(struct nbd_device *nbd)
> */
> if (i)
> flush_workqueue(nbd->recv_workq);
> + destroy_workqueue(nbd->recv_workq);
> return -ENOMEM;
> }
> sk_set_memalloc(config->socks[i]->sock->sk);

In the netlink path, everything gets cleaned up when nbd_config_put is called on error.

Are you seeing an issue with the ioctl interface and this code path? I thought that normally, if the NBD_DO_IT ioctl fails, userspace closes the device, and that does the nbd_config_put that cleans this up, as is done in the netlink path.

If userspace is not closing the device and is instead retrying the NBD_DO_IT ioctl or reusing the device some other way, then I think you also need to NULL nbd->task_recv, remove the pid file, and NULL recv_workq after you destroy it, to cover the cases where nbd_config_put is called right after a failure.
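
To illustrate the ordering I mean, here is a rough userspace model, not kernel code and not a tested patch. Every model_* name is made up for illustration; the struct fields only stand in for nbd->recv_workq, nbd->task_recv and the NBD_RT_HAS_PID_FILE flag. It also assumes the teardown done from nbd_config_put skips a NULL recv_workq. The point is that if the failure path destroys the workqueue but leaves the pointer set, a later put destroys it a second time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for a kernel workqueue (hypothetical type). */
struct model_workqueue { int unused; };

/* Hypothetical model of the nbd_device fields discussed above. */
struct model_nbd {
	struct model_workqueue *recv_workq; /* models nbd->recv_workq */
	void *task_recv;                    /* models nbd->task_recv */
	bool has_pid_file;                  /* models NBD_RT_HAS_PID_FILE */
	int destroy_calls;                  /* how often the workqueue was destroyed */
};

/* Models destroy_workqueue(): must never be handed a stale or NULL queue. */
static void model_destroy_workqueue(struct model_nbd *nbd)
{
	assert(nbd->recv_workq != NULL);
	free(nbd->recv_workq);
	nbd->destroy_calls++;
}

/* Shared cleanup: undo the start steps and clear every stale reference. */
static void model_cleanup(struct model_nbd *nbd)
{
	nbd->has_pid_file = false;          /* models removing the pid file */
	nbd->task_recv = NULL;
	if (nbd->recv_workq) {
		model_destroy_workqueue(nbd);
		nbd->recv_workq = NULL;     /* so a later put does not destroy it again */
	}
}

/* Models the cleanup reached via nbd_config_put (assumed to skip NULL). */
static void model_config_put(struct model_nbd *nbd)
{
	model_cleanup(nbd);
}

/* Models nbd_start_device; 'fail' forces the error path after setup. */
static int model_start_device(struct model_nbd *nbd, bool fail)
{
	nbd->recv_workq = malloc(sizeof(*nbd->recv_workq));
	if (!nbd->recv_workq)
		return -1;
	nbd->task_recv = nbd;               /* any non-NULL marker will do here */
	nbd->has_pid_file = true;

	if (fail) {
		model_cleanup(nbd);         /* destroy, then NULL everything */
		return -1;
	}
	return 0;
}
```

With the pointers cleared in the failure path, a put right after the failure is a no-op for the workqueue, and a retry of the start allocates a fresh one.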