Re: [RFC PATCH] VFIO: Add a parameter to force nonthread IRQ

From: Yunhong Jiang
Date: Tue Oct 27 2015 - 02:45:50 EST


On Mon, Oct 26, 2015 at 09:37:14PM -0600, Alex Williamson wrote:
> On Mon, 2015-10-26 at 18:20 -0700, Yunhong Jiang wrote:
> > An option to force VFIO PCI MSI/MSI-X handler as non-threaded IRQ,
> > even when CONFIG_IRQ_FORCED_THREADING=y. This is uselful when
> > assigning a device to a guest with low latency requirement since it
> > reduce the context switch to/from the IRQ thread.
>
> Is there any way we can do this automatically? Perhaps detecting that
> we're on a RT kernel or maybe that the user is running with RT priority?
> I find that module options are mostly misunderstood and misused.

Alex, thanks for review.

It's not easy to detect if the user is running with RT priority, since
sometimes the user start the thread and then set the scheduler priority
late.

Also should we do this only for in kernel irqchip scenario and not for user
space handler, since in kernel irqchip has lower overhead?

>
> > An experiment was conducted on a HSW platform for 1 minutes, with the
> > guest vCPU bound to isolated pCPU. The assigned device triggered the
> > interrupt every 1ms. The average EXTERNAL_INTERRUPT exit handling time
> > is dropped from 5.3us to 2.2us.
> >
> > Another choice is to change VFIO_DEVICE_SET_IRQS ioctl, to apply this
> > option only to specific devices when in kernel irq_chip is enabled. It
> > provides more flexibility but is more complex, not sure if we need go
> > through that way.
>
> Allowing the user to decide whether or not to use a threaded IRQ seems
> like a privilege violation; a chance for the user to game the system and
> give themselves better latency, maybe at the cost of others. I think

Yes, you are right. One benefit of the ioctl change is to have a
per-device-option thus is more flexible.

> we're better off trying to infer the privilege from the task priority or

I'd think system admin may make decision after some tunning, like you said
it "maybe at the cost of others" and not sure if we should make decision
based on task priority or kernel config.

Thanks
--jyh

> kernel config or, if we run out of options, make a module option as you
> have here requiring the system admin to provide the privilege. Thanks,
>
> Alex
>
>
> > Signed-off-by: Yunhong Jiang <yunhong.jiang@xxxxxxxxxxxxxxx>
> > ---
> > drivers/vfio/pci/vfio_pci_intrs.c | 10 +++++++++-
> > 1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> > index 1f577b4..ca1f95a 100644
> > --- a/drivers/vfio/pci/vfio_pci_intrs.c
> > +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> > @@ -22,9 +22,13 @@
> > #include <linux/vfio.h>
> > #include <linux/wait.h>
> > #include <linux/slab.h>
> > +#include <linux/module.h>
> >
> > #include "vfio_pci_private.h"
> >
> > +static bool nonthread_msi = 1;
> > +module_param(nonthread_msi, bool, 0444);
> > +
> > /*
> > * INTx
> > */
> > @@ -313,6 +317,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > char *name = msix ? "vfio-msix" : "vfio-msi";
> > struct eventfd_ctx *trigger;
> > int ret;
> > + unsigned long irqflags = 0;
> >
> > if (vector >= vdev->num_ctx)
> > return -EINVAL;
> > @@ -352,7 +357,10 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > pci_write_msi_msg(irq, &msg);
> > }
> >
> > - ret = request_irq(irq, vfio_msihandler, 0,
> > + if (nonthread_msi)
> > + irqflags = IRQF_NO_THREAD;
> > +
> > + ret = request_irq(irq, vfio_msihandler, irqflags,
> > vdev->ctx[vector].name, trigger);
> > if (ret) {
> > kfree(vdev->ctx[vector].name);
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/