Re: Plan for /dev/ioasid RFC v2

From: Lu Baolu
Date: Thu Jun 10 2021 - 01:51:55 EST


On 6/9/21 8:39 PM, Jason Gunthorpe wrote:
On Wed, Jun 09, 2021 at 02:24:03PM +0200, Joerg Roedel wrote:
On Mon, Jun 07, 2021 at 02:58:18AM +0000, Tian, Kevin wrote:
- Device-centric (Jason) vs. group-centric (David) uAPI. David is not fully
convinced yet. Based on discussion v2 will continue to have ioasid uAPI
being device-centric (but it's fine for vfio to be group-centric). A new
section will be added to elaborate this part;
I would vote for group-centric here. Or do the reasons for which VFIO is
group-centric not apply to IOASID? If so, why?
VFIO being group centric has made it very ugly/difficult to inject
device driver specific knowledge into the scheme.

The device driver is the only thing that knows to ask:
- I need a SW table for this ioasid because I am like a mdev
- I will issue TLPs with PASID
- I need an IOASID linked to a PASID
- I am a device that uses ENQCMD and vPASID
- etc in future

The current approach has the group try to guess the device driver
intention in the vfio type 1 code.

I want to see this be clean and have the device driver directly tell
the iommu layer what kind of DMA it plans to do, and thus how it needs
the IOMMU and IOASID configured.

This is the source of the ugly symbol_get and the very, very hacky 'if
you are a mdev *and* an iommu then you must want a single PASID' stuff
in type1.

The group is causing all this mess because the group knows nothing
about what the device drivers contained in the group actually want.

Further, being group centric eliminates the possibility of working in
cases like !ACS. How do I use PASID functionality of a device behind a
!ACS switch if the uAPI forces all IOASIDs to be linked to a group,
not a device?

Device centric with a report that "all devices in the group must use
the same IOASID" covers all the new functionality, keeps the old, and
has a better chance to keep going as a uAPI into the future.

The iommu_group can guarantee the isolation among different physical
devices (represented by RIDs). But when it comes to sub-devices (e.g.
mdev or vDPA devices represented by RID + SSID), we have to rely on the
device driver for isolation. Devices that can generate sub-devices
should either use their own on-device mechanisms or use platform
features like Intel Scalable IOV to isolate the sub-devices.

Under the above conditions, different sub-devices of the same RID device
can use different IOASIDs. This seems to mean that we can't support a
mixed mode where, for example, two RIDs share an iommu_group and one
(or both) of them have sub-devices.

AIUI, when we attach a "RID + SSID" to an IOASID, we should require that
the RID doesn't share the iommu_group with any other RID.
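
To make the rule concrete, here is a minimal sketch of the check as a toy
model in plain C. It is not kernel code: struct toy_group, struct
toy_device, TOY_NO_SSID and toy_attach() are hypothetical names made up
for illustration, and the real check would presumably live wherever the
attach path ends up in the iommu layer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an iommu_group: only tracks how many
 * physical devices (RIDs) it contains. */
struct toy_group {
	int id;
	size_t nr_rids;
};

/* Hypothetical stand-in for a physical device identified by its RID. */
struct toy_device {
	unsigned int rid;
	struct toy_group *group;
};

/* Marker meaning "attach the whole RID, no sub-device involved". */
#define TOY_NO_SSID ((unsigned int)-1)

/*
 * Attach either a RID (ssid == TOY_NO_SSID) or a "RID + SSID" to an
 * IOASID. A "RID + SSID" attach is only accepted when the RID is the
 * sole member of its group; otherwise -EINVAL is returned.
 * The ioasid is unused because this toy only models the admission check.
 */
static int toy_attach(const struct toy_device *dev, unsigned int ssid,
		      int ioasid)
{
	assert(dev && dev->group);
	(void)ioasid;

	if (ssid != TOY_NO_SSID && dev->group->nr_rids != 1)
		return -EINVAL;

	return 0;
}
```

With this model, a RID alone in its group can attach sub-devices, while a
RID that shares its group with another RID can only be attached as a
whole.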

Best regards,
baolu