Re: [RFC PATCH v5 03/11] VFIO_IOMMU_TYPE1 for platform bus devices on ARM

From: Will Deacon
Date: Wed Apr 30 2014 - 09:08:34 EST


On Mon, Apr 28, 2014 at 09:08:10PM +0100, Alex Williamson wrote:
> On Mon, 2014-04-28 at 20:19 +0100, Will Deacon wrote:
> > Please excuse any ignorance on my part here (I'm not at all familiar with the
> > Intel IOMMU), but shouldn't this really be a property of the interrupt
> > controller itself? On ARM with GICv3, there is a separate block called the
> > ITS (interrupt translation service) which is part of the interrupt
> > controller. The ITS provides a doorbell page which the SMMU can map into a
> > guest operating system to provide MSI for passthrough devices, but this
> > isn't something the SMMU is aware of -- it will just see the iommu_map
> > request for a non-cacheable mapping.
>
> I don't know the history of why this is an IOMMU domain capability on
> x86, it's sort of a paradox. An MSI from a device is conceptually just
> a DMA write and is therefore logically co-located in the IOMMU hardware,
> but x86 doesn't allow it to be mapped via the IOMMU API interfaces. For
> compatibility, interrupt remapping support is buried deep in the
> request_irq interface and effectively invisible other than having this
> path to query it. Therefore this flag is effectively just saying "MSI
> isolation support is present and enabled". IOW, the host is protected
> from interrupt injection attacks from malicious devices. If there is
> some property of your platform that makes this always the case, then the
> IOMMU driver can always export this capability as true.

Thanks for the explanation. On ARM, the SMMU does indeed see the MSI write
just like a normal write, so it can be mapped via iommu_map() to point at
the interrupt controller doorbell page. I guess that means we can enable
this capability for all MSI-capable devices upstream of the SMMU, provided
that the IRQ controller doesn't have any horrible quirks.
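
A minimal sketch of that condition, written as a standalone C model rather
than kernel code. The struct and function names are hypothetical and exist
only for illustration; the real check would live in the IOMMU driver.

```c
#include <stdbool.h>

/* Hypothetical model types; none of these exist in the kernel. */
struct model_dev {
	bool msi_capable;   /* device can generate MSIs */
	bool behind_smmu;   /* device's writes are translated by the SMMU */
};

struct model_irq_chip {
	bool has_quirks;    /* doorbell page cannot safely be mapped for a guest */
};

/*
 * Returns true when MSI isolation could be advertised for the device:
 * the MSI write is an ordinary translated write, and the interrupt
 * controller has no quirk that defeats that.
 */
static bool model_msi_isolated(struct model_dev dev, struct model_irq_chip chip)
{
	return dev.msi_capable && dev.behind_smmu && !chip.has_quirks;
}
```

In this model a device that bypasses the SMMU, or an interrupt controller
with a quirk, makes the function return false, so the capability would not
be advertised.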

> With PCI, MSI is configured via spec defined configuration space
> registers, so we emulate these registers and prevent user access to them
> so that we don't need to allow the user a way to set up an interrupt
> remapping entry. It's done for them via request_irq.
>
> IIRC, the Freescale devices have a limited number of MSI pages and can
> therefore create some instances with isolation while others may require
> sharing. In that case I would expect this flag to indicate whether the
> domain has an exclusive or shared page.
>
> In any case, I suspect keying on the bus_type here is not the correct
> way to go. Thanks,

Agreed, I was more intrigued by the meaning of the flag.

Thanks,

Will