Re: [PATCH v3 4/4] virtio_net: disable cb aggressively

From: Jason Wang
Date: Mon Jan 16 2023 - 22:49:28 EST


On Mon, Jan 16, 2023 at 9:41 PM Laurent Vivier <lvivier@xxxxxxxxxx> wrote:
>
> Hi Michael,
>
> On 5/26/21 10:24, Michael S. Tsirkin wrote:
> > There are currently two cases where we poll TX vq not in response to a
> > callback: start xmit and rx napi. We currently do this with callbacks
> > enabled which can cause extra interrupts from the card. Used not to be
> > a big issue as we run with interrupts disabled but that is no longer the
> > case, and in some cases the rate of spurious interrupts is so high
> > linux detects this and actually kills the interrupt.
> >
> > Fix up by disabling the callbacks before polling the tx vq.
> >
> > Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > ---
> > drivers/net/virtio_net.c | 16 ++++++++++++----
> > 1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index c29f42d1e04f..a83dc038d8af 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -1433,7 +1433,10 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
> > return;
> >
> > if (__netif_tx_trylock(txq)) {
> > - free_old_xmit_skbs(sq, true);
> > + do {
> > + virtqueue_disable_cb(sq->vq);
> > + free_old_xmit_skbs(sq, true);
> > + } while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> >
> > if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > netif_tx_wake_queue(txq);
> > @@ -1605,12 +1608,17 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
> > bool kick = !netdev_xmit_more();
> > bool use_napi = sq->napi.weight;
> > + unsigned int bytes = skb->len;
> >
> > /* Free up any pending old buffers before queueing new ones. */
> > - free_old_xmit_skbs(sq, false);
> > + do {
> > + if (use_napi)
> > + virtqueue_disable_cb(sq->vq);
> >
> > - if (use_napi && kick)
> > - virtqueue_enable_cb_delayed(sq->vq);
> > + free_old_xmit_skbs(sq, false);
> > +
> > + } while (use_napi && kick &&
> > + unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> >
> > /* timestamp packet in software */
> > skb_tx_timestamp(skb);
>
> This patch seems to introduce a problem with QEMU connected to passt using netdev stream
> backend.
>
> When I run an iperf3 test I get after 1 or 2 seconds of test:
>
> [ 254.035559] NETDEV WATCHDOG: ens3 (virtio_net): transmit queue 0 timed out
> ...
> [ 254.060962] virtio_net virtio1 ens3: TX timeout on queue: 0, sq: output.0, vq: 0x1,
> name: output.0, 8856000 usecs ago
> [ 259.155150] virtio_net virtio1 ens3: TX timeout on queue: 0, sq: output.0, vq: 0x1,
> name: output.0, 13951000 usecs ago
>
> In QEMU, I can see in virtio_net_tx_bh() the function virtio_net_flush_tx() has flushed
> all the queue entries and re-enabled the queue notification with
> virtio_queue_set_notification() and tries to flush again the queue and as it is empty it
> does nothing more and then rely on a kernel notification to re-enable the bottom half
> function. As this notification never comes the queue is stuck and kernel add entries but
> QEMU doesn't remove them:
>
> 2812 static void virtio_net_tx_bh(void *opaque)
> 2813 {
> ...
> 2833 ret = virtio_net_flush_tx(q);
>
> -> flush the queue and ret is not an error and not n->tx_burst (that would re-schedule the
> function)
>
> ...
> 2850 virtio_queue_set_notification(q->tx_vq, 1);
>
> -> re-enable the queue notification
>
> 2851 ret = virtio_net_flush_tx(q);
> 2852 if (ret == -EINVAL) {
> 2853 return;
> 2854 } else if (ret > 0) {
> 2855 virtio_queue_set_notification(q->tx_vq, 0);
> 2856 qemu_bh_schedule(q->tx_bh);
> 2857 q->tx_waiting = 1;
> 2858 }
>
> -> ret is 0, exit the function without re-scheduling the function.
> ...
> 2859 }
>
> If I revert this patch in the kernel (a7766ef18b33 ("virtio_net: disable cb
> aggressively")), it works fine.
>
> How to reproduce it:
>
> I start passt (https://passt.top/passt):
>
> passt -f
>
> and then QEMU
>
> qemu-system-x86_64 ... --netdev
> stream,id=netdev0,server=off,addr.type=unix,addr.path=/tmp/passt_1.socket -device
> virtio-net,mac=9a:2b:2c:2d:2e:2f,netdev=netdev0
>
> Host side:
>
> sysctl -w net.core.rmem_max=134217728
> sysctl -w net.core.wmem_max=134217728
> iperf3 -s
>
> Guest side:
>
> sysctl -w net.core.rmem_max=536870912
> sysctl -w net.core.wmem_max=536870912
>
> ip link set dev $DEV mtu 256
> iperf3 -c $HOST -t10 -i0 -Z -P 8 -l 1M --pacing-timer 1000000 -w 4M
>
> Any idea of what is the problem?

This looks similar to what I spot and try to fix in:

[PATCH net V3] virtio-net: correctly enable callback during start_xmit

(I've cced you in this version).

Thanks

>
> Thanks,
> Laurent
>
>