Re: [RFC PATCH 0/1] virtio: false unhandled irqs from vring_interrupt()

From: Stefan Hajnoczi
Date: Tue Aug 24 2021 - 12:36:51 EST


On Tue, Aug 24, 2021 at 07:31:29AM -0400, Michael S. Tsirkin wrote:
> On Tue, Aug 24, 2021 at 11:59:43AM +0100, Stefan Hajnoczi wrote:
> > While investigating an unhandled irq from vring_interrupt() with virtiofs I
> > stumbled onto a possible race that also affects virtio_gpu. This theory is
> > based on code inspection and hopefully you can point out something that makes
> > this a non-issue in practice :).
> >
> > The vring_interrupt() function returns IRQ_NONE when an MSI-X interrupt is
> > taken with no used (completed) buffers in the virtqueue. The kernel disables
> > the irq and the driver is no longer receives irqs when this happens:
> >
> > irq 77: nobody cared (try booting with the "irqpoll" option)
> > ...
> > handlers:
> > [<00000000a40a49bb>] vring_interrupt
> > Disabling IRQ #77
> >
> > Consider the following:
> >
> > 1. An virtiofs irq is handled and the virtio_fs_requests_done_work() work
> > function is scheduled to dequeue used buffers:
> > vring_interrupt() -> virtio_fs_vq_done() -> schedule_work()
> >
> > 2. The device adds more used requests and just before...
> >
> > 3. ...virtio_fs_requests_done_work() empties the virtqueue with
> > virtqueue_get_buf().
> >
> > 4. The device raises the irq and vring_interrupt() is called after
> > virtio_fs_requests_done_work emptied the virtqueue:
> >
> > irqreturn_t vring_interrupt(int irq, void *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > if (!more_used(vq)) {
> > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > return IRQ_NONE;
> > ^^^^^^^^^^^^^^^^
> >
> > I have included a patch that switches virtiofs from spin_lock() to
> > spin_lock_irqsave() to prevent vring_interrupt() from running while the
> > virtqueue is processed from a work function.
> >
> > virtio_gpu has a similar case where virtio_gpu_dequeue_ctrl_func() and
> > virtio_gpu_dequeue_cursor_func() work functions only use spin_lock().
> > I think this can result in the same false unhandled irq problem as virtiofs.
> >
> > This race condition could in theory affect all drivers. The VIRTIO
> > specification says:
> >
> > Neither of these notification suppression methods are reliable, as they are
> > not synchronized with the device, but they serve as useful optimizations.
> >
> > If virtqueue_disable_cb() is just a hint and might not disable virtqueue irqs
> > then virtio_net and other drivers have a problem because because an irq could
> > be raised while the driver is dequeuing used buffers. I think we haven't seen
> > this because software VIRTIO devices honor virtqueue_disable_cb(). Hardware
> > devices might cache the value and not disable notifications for some time...
> >
> > Have I missed something?
> >
> > The virtiofs patch I attached is being stress tested to see if the unhandled
> > irqs still occur.
> >
> > Stefan Hajnoczi (1):
> > fuse: disable local irqs when processing vq completions
> >
> > fs/fuse/virtio_fs.c | 15 ++++++++++-----
> > 1 file changed, 10 insertions(+), 5 deletions(-)
>
> Fundamentally it is not a problem to have an unhandled IRQ
> once in a while. It's only a problem if this happens time
> after time.
>
>
> Does the kernel you are testing include
> commit 8d622d21d24803408b256d96463eac4574dcf067
> Author: Michael S. Tsirkin <mst@xxxxxxxxxx>
> Date: Tue Apr 13 01:19:16 2021 -0400
>
> virtio: fix up virtio_disable_cb
>
> ?
>
> If not it's worth checking whether the latest kernel still
> has the issue.
>
> Note cherry picking that commit isn't trivial there are
> a bunch of dependencies.
>
> If you want to try on an old kernel, disable event index.

Thanks for pointing out this commit. My kernel does not have it. The
device is using the split vring layout (probably with EVENT_IDX) so
virtqueue_disable_cb() has no effect.

kernel/irq/spurious.c:note_interrupt() disables the irq after 99.9k
unhandled irqs. It's surprising that virtiofs is hitting this limit
through the scenario I described, but maybe.

I'll try different kernel versions and/or disable EVENT_IDX as you
suggested. Thanks!

Stefan

Attachment: signature.asc
Description: PGP signature