Re: Virtualizing MSI-X on IMS via VFIO

From: Alex Williamson
Date: Wed Jun 23 2021 - 22:48:35 EST


On Thu, 24 Jun 2021 04:20:31 +0200
Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> Kevin,
>
> thank you very much for digging into this! You made my day!
>
> On Thu, Jun 24 2021 at 00:00, Kevin Tian wrote:
> >> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> >> To work with what we've got, the vfio API describes the limitation of
> >> the host interfaces via the VFIO_IRQ_INFO_NORESIZE flag. QEMU then
> >> makes a choice in an attempt to better reflect what we can infer of the
> >> guest programming of the device to incrementally enable vectors. We
> >
> > It's a surprise to me that QEMU doesn't even look at this flag today,
> > after searching its code...
>
> Indeed.
>
> git clone https://github.com/qemu/qemu.git
> cd qemu
> git log -p | grep NORESIZE
> + * The NORESIZE flag indicates that the interrupt lines within the index
> +#define VFIO_IRQ_INFO_NORESIZE (1 << 3)
>
> According to the git history of QEMU this was never used at all and I
> don't care about the magic muck which might be in some RHT repository
> which might make use of that.
>
> Find below the proper fix for this nonsense which just wasted everyone's
> time. I'll post it officially with a proper changelog tomorrow unless
> Kevin, who actually unearthed this and surely earns the credit, beats
> me to it.
>
> Alex, I seriously have to ask what you were trying to tell us about this
> flag and its great value and the design related to this.
>
> I'm sure you can submit the corresponding fix to qemu yourself.
>
> And once you are back from lala land, can you please explain how
> VFIO/PCI/MSIX is supposed to work in reality?

It's part of the spec. There has never been a case of !NORESIZE, so
assuming NORESIZE is the safe behavior. Sorry, there's no smoking gun
here. NAK.
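
For reference, the userspace choice that the flag is meant to inform can
be sketched roughly as below. This is only an illustration under stated
assumptions, not QEMU or kernel code: the struct msix_resize_plan type
and the plan_msix_resize() helper are hypothetical names, and the flag
value is copied from the uapi header quoted in the patch. It covers only
the incremental strategy; allocating the maximum number of vectors
upfront is the other option the header comment describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag value as quoted from include/uapi/linux/vfio.h in this thread. */
#define VFIO_IRQ_INFO_NORESIZE (1 << 3)

/* Hypothetical: what userspace has to do to reach `wanted` vectors. */
struct msix_resize_plan {
	bool teardown_first;	/* disable the whole index before re-enabling */
	uint32_t nvec;		/* vector count to have enabled afterwards */
};

/*
 * Hypothetical helper. `irq_info_flags` is assumed to be the flags field
 * returned for the MSI-X index, `enabled` the number of vectors currently
 * set up, and `wanted` the number the guest now needs.
 */
static struct msix_resize_plan
plan_msix_resize(uint32_t irq_info_flags, uint32_t enabled, uint32_t wanted)
{
	struct msix_resize_plan plan = { .teardown_first = false, .nvec = enabled };

	if (wanted <= enabled)
		return plan;	/* nothing to grow, keep the current setup */

	plan.nvec = wanted;

	/* With NORESIZE, a live index cannot gain subindexes in place. */
	if ((irq_info_flags & VFIO_IRQ_INFO_NORESIZE) && enabled > 0)
		plan.teardown_first = true;

	return plan;
}
```

With the flag set and one vector already enabled, growing to three
vectors yields teardown_first == true. Without the flag, the same
request would not need the teardown.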

> ---
> --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
> +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
> @@ -1644,8 +1644,6 @@ static long intel_vgpu_ioctl(struct mdev
> if (info.index == VFIO_PCI_INTX_IRQ_INDEX)
> info.flags |= (VFIO_IRQ_INFO_MASKABLE |
> VFIO_IRQ_INFO_AUTOMASKED);
> - else
> - info.flags |= VFIO_IRQ_INFO_NORESIZE;
>
> return copy_to_user((void __user *)arg, &info, minsz) ?
> -EFAULT : 0;
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1018,8 +1018,6 @@ static long vfio_pci_ioctl(struct vfio_d
> if (info.index == VFIO_PCI_INTX_IRQ_INDEX)
> info.flags |= (VFIO_IRQ_INFO_MASKABLE |
> VFIO_IRQ_INFO_AUTOMASKED);
> - else
> - info.flags |= VFIO_IRQ_INFO_NORESIZE;
>
> return copy_to_user((void __user *)arg, &info, minsz) ?
> -EFAULT : 0;
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -693,16 +693,6 @@ struct vfio_region_info_cap_nvlink2_lnks
> * automatically masked by VFIO and the user needs to unmask the line
> * to receive new interrupts. This is primarily intended to distinguish
> * level triggered interrupts.
> - *
> - * The NORESIZE flag indicates that the interrupt lines within the index
> - * are setup as a set and new subindexes cannot be enabled without first
> - * disabling the entire index. This is used for interrupts like PCI MSI
> - * and MSI-X where the driver may only use a subset of the available
> - * indexes, but VFIO needs to enable a specific number of vectors
> - * upfront. In the case of MSI-X, where the user can enable MSI-X and
> - * then add and unmask vectors, it's up to userspace to make the decision
> - * whether to allocate the maximum supported number of vectors or tear
> - * down setup and incrementally increase the vectors as each is enabled.
> */
> struct vfio_irq_info {
> __u32 argsz;
> @@ -710,7 +700,6 @@ struct vfio_irq_info {
> #define VFIO_IRQ_INFO_EVENTFD (1 << 0)
> #define VFIO_IRQ_INFO_MASKABLE (1 << 1)
> #define VFIO_IRQ_INFO_AUTOMASKED (1 << 2)
> -#define VFIO_IRQ_INFO_NORESIZE (1 << 3)
> __u32 index; /* IRQ index */
> __u32 count; /* Number of IRQs within this index */
> };
> --- a/samples/vfio-mdev/mtty.c
> +++ b/samples/vfio-mdev/mtty.c
> @@ -1092,9 +1092,6 @@ static int mtty_get_irq_info(struct mdev
> if (irq_info->index == VFIO_PCI_INTX_IRQ_INDEX)
> irq_info->flags |= (VFIO_IRQ_INFO_MASKABLE |
> VFIO_IRQ_INFO_AUTOMASKED);
> - else
> - irq_info->flags |= VFIO_IRQ_INFO_NORESIZE;
> -
> return 0;
> }
>
>