Re: [PATCH v3] virtio_net: avoid using netif_tx_disable() for serializing tx routine

From: ake
Date: Wed Oct 17 2018 - 23:25:21 EST




On 2018/10/18 0:09, Michael S. Tsirkin wrote:
> On Wed, Oct 17, 2018 at 07:44:12PM +0900, Ake Koomsin wrote:
>> Commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
>> introduces netif_tx_disable() after netif_device_detach() in order to
>> avoid use-after-free of tx queues. However, there are two issues.
>>
>> 1) Its operation is redundant with netif_device_detach() in case the
>> interface is running.
>> 2) In case of the interface is not running before suspending and
>> resuming, the tx does not get resumed by netif_device_attach().
>> This results in losing network connectivity.
>>
>> It is better to use netif_tx_lock_bh()/netif_tx_unlock_bh() instead for
>> serializing tx routine during reset. This also preserves the symmetry
>> of netif_device_detach() and netif_device_attach().
>>
>> Fixes commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
>> Signed-off-by: Ake Koomsin <ake@xxxxxxxxxx>
>
> Acked-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
>
> Thanks a lot for debugging!
> Seems like stable material to me, right?

Yes. With this patch, we can avoid network connectivity lost
because of tx not get re-enabled under some situation. Plus, it avoids
redundant operation between netif_device_detach() and
netif_tx_disable().

I tested the patch on Linux net-next and QEMU master by suspending/
resuming the virtual machine repeatedly. The network looks no problem
and has no connectivity lost so far. I tested with both user-mode
networking and tap interface.

Best Regards

>> ---
>> drivers/net/virtio_net.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index 3f5aa59c37b7..3e2c041d76ac 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -2267,8 +2267,9 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
>> /* Make sure no work handler is accessing the device */
>> flush_work(&vi->config_work);
>>
>> + netif_tx_lock_bh(vi->dev);
>> netif_device_detach(vi->dev);
>> - netif_tx_disable(vi->dev);
>> + netif_tx_unlock_bh(vi->dev);
>> cancel_delayed_work_sync(&vi->refill);
>>
>> if (netif_running(vi->dev)) {
>> @@ -2304,7 +2305,9 @@ static int virtnet_restore_up(struct virtio_device *vdev)
>> }
>> }
>>
>> + netif_tx_lock_bh(vi->dev);
>> netif_device_attach(vi->dev);
>> + netif_tx_unlock_bh(vi->dev);
>> return err;
>> }
>>
>> --
>> 2.19.1