On 7/2/25 8:02 AM, Ming Lei wrote:
On Wed, Jul 02, 2025 at 09:12:09AM +0800, Yu Kuai wrote:Yeah I think retry should be a safe bet here.
Hi,
在 2025/07/01 21:28, Nilay Shroff 写道:
On 6/28/25 6:18 AM, Yu Kuai wrote:
Hi,
在 2025/06/27 19:04, Ming Lei 写道:
I guess the patch in the following link may be simper, both two take
similar approach:
https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
I this the above approach has concurrent problems if nbd_start_device
concurrent with nbd_start_device:
t1:
nbd_start_device
lock
num_connections = 1
unlock
t2:
nbd_add_socket
lock
config->num_connections++
unlock
t3:
nbd_start_device
lock
num_connections = 2
unlock
blk_mq_update_nr_hw_queues
blk_mq_update_nr_hw_queues
//nr_hw_queues updated to 1 before failure
return -EINVAL
In the above case, yes I see that t1 would return -EINVAL (as
config->num_connections doesn't match with num_connections)
but then t3 would succeed to update nr_hw_queue (as both
config->num_connections and num_connections set to 2 this
time). Isn't it? If yes, then the above patch (from Ming)
seems good.
Emm, I'm confused, If you agree with the concurrent process, then
t3 update nr_hw_queues to 2 first and return sucess, later t1 update
nr_hw_queues back to 1 and return failure.
It should be easy to avoid failure by simple retrying.
On another note, synchronizing nbd_start_device and nbd_add_socket
using nbd->task_setup looks more complex and rather we may use
nbd->pid to synchronize both. We need to move setting of nbd->pid
before we invoke blk_mq_update_nr_hw_queues in nbd_start_device.
Then in nbd_add_socket we can evaluate nbd->pid and if it's
non-NULL then we could assume that either nr_hw_queues update is in
progress or device has been setup and so return -EBUSY. I think
anyways updating number of connections once device is configured
would not be possible, so once nbd_start_device is initiated, we
shall prevent user adding more connections. If we follow this
approach then IMO we don't need to add retry discussed above.
Thanks,
--Nilay
.