Re: [PATCH v6 05/20] vfio: mdev: common lib code for setting up Interrupt Message Store

From: Dave Jiang
Date: Thu May 27 2021 - 21:50:06 EST



On 5/23/2021 5:02 PM, Jason Gunthorpe wrote:
On Fri, May 21, 2021 at 05:19:36PM -0700, Dave Jiang wrote:
Add common helper code to setup IMS once the MSI domain has been
setup by the device driver. The main helper function is
mdev_ims_set_msix_trigger() that is called by the VFIO ioctl
VFIO_DEVICE_SET_IRQS. The function deals with the setup and
teardown of emulated and IMS backed eventfd that gets exported
to the guest kernel via VFIO as MSIX vectors.

Suggested-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Signed-off-by: Dave Jiang <dave.jiang@xxxxxxxxx>
---
drivers/vfio/mdev/Kconfig | 12 ++
drivers/vfio/mdev/Makefile | 3
drivers/vfio/mdev/mdev_irqs.c | 318 +++++++++++++++++++++++++++++++++++++++++
include/linux/mdev.h | 51 +++++++
4 files changed, 384 insertions(+)
create mode 100644 drivers/vfio/mdev/mdev_irqs.c
IMS is not mdev specific, do not entangle it with mdev code. This
should be generic VFIO stuff.

+static int mdev_msix_set_vector_signal(struct mdev_irq *mdev_irq, int vector, int fd)
+{
+ int rc, irq;
+ struct mdev_device *mdev = irq_to_mdev(mdev_irq);
+ struct mdev_irq_entry *entry;
+ struct device *dev = &mdev->dev;
+ struct eventfd_ctx *trigger;
+ char *name;
+ bool pasid_en;
+ u32 auxval;
+
+ if (vector < 0 || vector >= mdev_irq->num)
+ return -EINVAL;
+
+ entry = &mdev_irq->irq_entries[vector];
+
+ if (entry->ims)
+ irq = dev_msi_irq_vector(dev, entry->ims_id);
+ else
+ irq = 0;
+
+ pasid_en = mdev_irq->pasid != INVALID_IOASID ? true : false;
+
+ /* IMS and invalid pasid is not a valid configuration */
+ if (entry->ims && !pasid_en)
+ return -EINVAL;
+
+ if (entry->trigger) {
+ if (irq) {
+ irq_bypass_unregister_producer(&entry->producer);
+ free_irq(irq, entry->trigger);
+ if (pasid_en) {
+ auxval = ims_ctrl_pasid_aux(0, false);
+ irq_set_auxdata(irq, IMS_AUXDATA_CONTROL_WORD, auxval);
+ }
+ }
+ kfree(entry->name);
+ eventfd_ctx_put(entry->trigger);
+ entry->trigger = NULL;
+ }
+
+ if (fd < 0)
+ return 0;
+
+ name = kasprintf(GFP_KERNEL, "vfio-mdev-irq[%d](%s)", vector, dev_name(dev));
+ if (!name)
+ return -ENOMEM;
+
+ trigger = eventfd_ctx_fdget(fd);
+ if (IS_ERR(trigger)) {
+ kfree(name);
+ return PTR_ERR(trigger);
+ }
+
+ entry->name = name;
+ entry->trigger = trigger;
+
+ if (!irq)
+ return 0;
+
+ if (pasid_en) {
+ auxval = ims_ctrl_pasid_aux(mdev_irq->pasid, true);
+ rc = irq_set_auxdata(irq, IMS_AUXDATA_CONTROL_WORD, auxval);
+ if (rc < 0)
+ goto err;
Why is anything to do with PASID here? Something has gone wrong with
the layers I suspect..

Oh yes. drivers/irqchip/irq-ims-msi.c is dxd specific and shouldn't be
pretending to be common code.

The protocol to stuff the pasid and other stuff into the auxdata is
also compeltely idxd specific and is just a hacky way to communicate
from this code to the IDXD irq-chip.

So this doesn't belong here either. Pass in the auxdata from the idxd
code and I'd rename the irq-ims-msi to irq-ims-idxd

+static int mdev_msix_enable(struct mdev_irq *mdev_irq, int nvec)
+{
+ struct mdev_device *mdev = irq_to_mdev(mdev_irq);
+ struct device *dev;
+ int rc;
+
+ if (nvec != mdev_irq->num)
+ return -EINVAL;
+
+ if (mdev_irq->ims_num) {
+ dev = &mdev->dev;
+ rc = msi_domain_alloc_irqs(dev_get_msi_domain(dev), dev, mdev_irq->ims_num);
Huh? The PCI device should be the only device touching IRQ stuff. I'm
nervous to see you mix in the mdev struct device into this function.

As we talked about in the other thread. We have a single IMS domain per device. The domain is set to the mdev 'struct device' and we allocate the vectors to each mdev 'struct device' so we can manage those IMS vectors specifically for that mdev.


Isn't the msi_domain just idxd->ims_domain?

Yes



Jason