Re: [patch 21/32] NTB/msi: Convert to msi_on_each_desc()

From: Jason Gunthorpe
Date: Fri Dec 10 2021 - 07:39:44 EST


On Fri, Dec 10, 2021 at 07:29:01AM +0000, Tian, Kevin wrote:
> > 5) It's not possible for the kernel to reliably detect whether it is
> > running on bare metal or not. Yes we talked about heuristics, but
> > that's something I really want to avoid.
>
> How would the hypercall mechanism avoid such heuristics?

It is clever: we don't have a vIOMMU that supplies vIR today, so by
definition all guests are excluded and only bare metal works.

> > The charm is that this works for everything from INTx to IMS because all
> > of them go through the same procedure, except that INTx (IO/APIC) does
> > not support the reservation mode dance.

Do we even have vIOAPIC?

> > Thoughts?

It seems reasonable - do you have any idea how this all would work on
ARM too? IMS on bare metal ARM is surely interesting. I assume they
have a similar issue with trapping the MSI.

> Then Qemu needs to find out the GSI number for the vIRTE handle.
> Again Qemu doesn't have such information since it doesn't know
> which MSI[-X] entry points to this handle due to no trap.

No, this is already going wrong. qemu *cannot* know the MSI information
because there is no MSI information for IMS.

All qemu should get is the origin device information and data about
how the guest wants the interrupt set up.

Forget about guests and all of this complexity, design how to make
VFIO work with IMS in pure userspace like DPDK.

We must have a VFIO ioctl to acquire an addr/data pair and link it to
an eventfd.
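
To make the shape concrete, here is a rough userspace mock-up of that
contract. Nothing in it exists in the VFIO uAPI: struct ims_vector_req,
mock_ims_alloc() and mock_device_write() are hypothetical names, the
address is a placeholder, and the "kernel" side is only a table in the
same process. It just shows an addr/data pair being handed out and a
write of that pair signalling the linked eventfd.

```c
/* Hypothetical sketch: none of these names exist in the VFIO uAPI. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/eventfd.h>

/* Hypothetical argument block for an "allocate IMS vector" ioctl. */
struct ims_vector_req {
	int32_t  eventfd;	/* in:  fd signalled when the interrupt fires */
	uint64_t addr;		/* out: address the device must write to */
	uint32_t data;		/* out: payload the device must write */
};

#define MOCK_MAX_VECTORS	8
#define MOCK_MSG_ADDR		0xfee00000ULL	/* placeholder, illustration only */

/* Stand-in for kernel state: which eventfd each allocated vector signals. */
static int mock_vector_fd[MOCK_MAX_VECTORS];
static unsigned int mock_nr_vectors;

/* Stand-in for the ioctl: hand out an addr/data pair bound to req->eventfd. */
static int mock_ims_alloc(struct ims_vector_req *req)
{
	assert(req != NULL);

	if (mock_nr_vectors >= MOCK_MAX_VECTORS || req->eventfd < 0)
		return -1;

	req->addr = MOCK_MSG_ADDR;
	req->data = mock_nr_vectors;	/* data selects the vector */
	mock_vector_fd[mock_nr_vectors++] = req->eventfd;
	return 0;
}

/* Stand-in for the device writing data to addr: signal the linked eventfd. */
static int mock_device_write(uint64_t addr, uint32_t data)
{
	uint64_t one = 1;

	if (addr != MOCK_MSG_ADDR || data >= mock_nr_vectors)
		return -1;

	if (write(mock_vector_fd[data], &one, sizeof(one)) != (ssize_t)sizeof(one))
		return -1;
	return 0;
}
```

A DPDK-style user would create the eventfd, call the allocation, program
the returned addr/data into the device's own interrupt storage, and then
read or poll the eventfd. Note the mock has no mask/unmask path at all.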

I'm not sure exactly how this should be done. It is 90% of what IMS
is, except the VFIO irq_chip cannot touch any real HW and certainly
cannot do mask/unmask.

Maybe that is OK now that it requires IR?

Jason