Re: [PATCH v3 2/4] iommu: Add iommu_aux_at(de)tach_group()

From: Lu Baolu
Date: Sun Aug 02 2020 - 22:02:39 EST


Hi Alex,

On 8/1/20 2:05 AM, Alex Williamson wrote:
On Fri, 31 Jul 2020 13:47:57 +0800
Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote:

Hi Alex,

On 2020/7/31 3:46, Alex Williamson wrote:
On Wed, 29 Jul 2020 23:34:40 +0000
"Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
From: Alex Williamson <alex.williamson@xxxxxxxxxx>
Sent: Thursday, July 30, 2020 4:04 AM

On Thu, 16 Jul 2020 09:07:46 +0800
Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote:
Hi Jacob,

On 7/16/20 12:01 AM, Jacob Pan wrote:
On Wed, 15 Jul 2020 08:47:36 +0800
Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote:
Hi Jacob,

On 7/15/20 12:39 AM, Jacob Pan wrote:
On Tue, 14 Jul 2020 13:57:01 +0800
Lu Baolu<baolu.lu@xxxxxxxxxxxxxxx> wrote:
This adds two new aux-domain APIs for a use case like vfio/mdev
where sub-devices derived from an aux-domain capable device are
created and put in an iommu_group.

/**
* iommu_aux_attach_group - attach an aux-domain to an
iommu_group
which
* contains sub-devices (for example
mdevs) derived
* from @dev.
* @domain: an aux-domain;
* @group: an iommu_group which contains sub-devices derived
from
@dev;
* @dev: the physical device which supports
IOMMU_DEV_FEAT_AUX.
*
* Returns 0 on success, or an error value.
*/
int iommu_aux_attach_group(struct iommu_domain *domain,
struct iommu_group *group,
struct device *dev)

/**
* iommu_aux_detach_group - detach an aux-domain from an
iommu_group *
* @domain: an aux-domain;
* @group: an iommu_group which contains sub-devices derived
from
@dev;
* @dev: the physical device which supports
IOMMU_DEV_FEAT_AUX.
*
* @domain must have been attached to @group via
iommu_aux_attach_group(). */
void iommu_aux_detach_group(struct iommu_domain *domain,
struct iommu_group *group,
struct device *dev)

It also adds a flag in the iommu_group data structure to identify
an iommu_group with aux-domain attached from those normal ones.

Signed-off-by: Lu Baolu<baolu.lu@xxxxxxxxxxxxxxx>
---
drivers/iommu/iommu.c | 58
+++++++++++++++++++++++++++++++++++++++++++
include/linux/iommu.h |
17 +++++++++++++ 2 files changed, 75 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index e1fdd3531d65..cad5a19ebf22 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -45,6 +45,7 @@ struct iommu_group {
struct iommu_domain *default_domain;
struct iommu_domain *domain;
struct list_head entry;
+ unsigned int aux_domain_attached:1;
};

struct group_device {
@@ -2759,6 +2760,63 @@ int iommu_aux_get_pasid(struct
iommu_domain
*domain, struct device *dev) }
EXPORT_SYMBOL_GPL(iommu_aux_get_pasid);

+/**
+ * iommu_aux_attach_group - attach an aux-domain to an
iommu_group
which
+ * contains sub-devices (for example
mdevs) derived
+ * from @dev.
+ * @domain: an aux-domain;
+ * @group: an iommu_group which contains sub-devices derived
from
@dev;
+ * @dev: the physical device which supports
IOMMU_DEV_FEAT_AUX.
+ *
+ * Returns 0 on success, or an error value.
+ */
+int iommu_aux_attach_group(struct iommu_domain *domain,
+ struct iommu_group *group, struct
device *dev) +{
+ int ret = -EBUSY;
+
+ mutex_lock(&group->mutex);
+ if (group->domain)
+ goto out_unlock;
+
Perhaps I missed something but are we assuming only one mdev per
mdev group? That seems to change the logic where vfio does:
iommu_group_for_each_dev()
iommu_aux_attach_device()

It has been changed in PATCH 4/4:

static int vfio_iommu_attach_group(struct vfio_domain *domain,
struct vfio_group *group)
{
if (group->mdev_group)
return iommu_aux_attach_group(domain->domain,
group->iommu_group,
group->iommu_device);
else
return iommu_attach_group(domain->domain,
group->iommu_group);
}

So, for both normal domain and aux-domain, we use the same concept:
attach a domain to a group.
I get that, but don't you have to attach all the devices within the

This iommu_group includes only mediated devices derived from an
IOMMU_DEV_FEAT_AUX-capable device. Different from
iommu_attach_group(),
iommu_aux_attach_group() doesn't need to attach the domain to each
device in group, instead it only needs to attach the domain to the
physical device where the mdev's were created from.
group? Here you see the group already has a domain and exit.

If the (group->domain) has been set, that means a domain has already
attached to the group, so it returns -EBUSY.

I agree with Jacob, singleton groups should not be built into the IOMMU
API, we're not building an interface just for mdevs or current
limitations of mdevs. This also means that setting a flag on the group
and passing a device that's assumed to be common for all devices within
the group, don't really make sense here. Thanks,

Alex

Baolu and I discussed about this assumption before. The assumption is
not based on singleton groups. We do consider multiple mdevs in one
group. But our feeling at the moment is that all mdevs (or other AUX
derivatives) in the same group should come from the same parent
device, thus comes with above design. Does it sound a reasonable
assumption to you?

No, the approach in this series doesn't really make sense to me. We
currently have the following workflow as Baolu notes in the cover
letter:

domain = iommu_domain_alloc(bus);

iommu_group_for_each_dev(group...

iommu_device = mdev-magic()

if (iommu_dev_feature_enabled(iommu_device,
IOMMU_DEV_FEAT_AUX))
iommu_aux_attach_device(domain, iommu_device);

And we want to convert this to a group function, like we have for
non-aux domains:

domain = iommu_domain_alloc(bus);

iommu_device = mdev-magic()

iommu_aux_attach_group(domain, group, iommu_device);

And I think we want to do that largely because iommu_group.domain is
private to iommu.c (therefore vfio code cannot set it), but we need it
set in order for iommu_get_domain_for_dev() to work with a group
attached to an aux domain. Passing an iommu_device avoids the problem
that IOMMU API code doesn't know how to derive an iommu_device for each
device in the group, but while doing so it ignores the fundamental
nature of a group as being a set of one or more devices. Even if we
can make the leap that all devices within the group would use the same
iommu_device, an API that sets and aux domain for a group while
entirely ignoring the devices within the group seems very broken.

Agreed. We couldn't assume that all devices in an iommu group shares a
same iommu_device, especially when it comes to PF/VF wrapped mediated
device case.


So, barring adding an abstraction at struct device where an IOMMU API
could retrieve the iommu_device backing anther device (which seems a
very abstract concept for the base class), why not have the caller
provide a lookup function? Ex:

int iommu_aux_attach_group(struct iommu_domain *domain,
struct iommu_group *group,
struct device *(*iommu_device_lookup)(
struct device *dev));

Thus vfio could could simply provide &vfio_mdev_get_iommu_device and
we'd have equivalent functionality to what we have currently, but with
the domain pointer set in the iommu_group.

This looks good to me.


This also however highlights that our VF backed mdevs will have the
same issue, so maybe this new IOMMU API interface should mimic
vfio_mdev_attach_domain() more directly, testing whether the resulting
device supports IOMMU_DEV_FEAT_AUX and using an aux vs non-aux attach.
I'm not sure what the name of this combined function should be,
iommu_attach_group_with_lookup()? This could be the core
implementation of iommu_attach_group() where the existing function
simply wraps the call with a NULL function pointer.

Anyway, I think there are ways to implement this that are more in line
with the spirit of groups.

Another possible implementation, just for discussion purpose:

1. Add a member in group_device to save the iommu_device if it exists:

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b6858adc4f17..6474e82cf4b4 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -47,9 +47,16 @@ struct iommu_group {
struct list_head entry;
};

+/*
+ * dma_alias: The device put in this group might depends on another
+ * physical device to do the DMA remapping. At(de)taching
+ * the domain to/from @dma_alias instead of @dev if
+ * @dma_alias is set.
+ */
struct group_device {
struct list_head list;
struct device *dev;
+ struct device *dma_alias;
char *name;
};

2. Pass in the iommu_device when calling iommu_group_add_device().

int iommu_group_add_device(struct iommu_group *group,
struct device *dev,
struct device *dma_alias)

Hence, the iommu core could get a chance to set the iommu_device in the
group device.

3. Mimic vfio_mdev_attach_domain() logic in iommu_group_do_attach_device():

if (group->dma_alias) {
if (iommu_dev_feature_enabled(group->dma_alias, IOMMU_DEV_FEAT_AUX))
iommu_aux_attach_device(domain, group->dma_alias);
else
__iommu_attach_device(domain, group->dma_alias);
} else {
__iommu_attach_device(domain, dev);
}

One limitation is that the driver should call mdev_set_iommu_device()
before the mdev_probe() get called.

That's an option, but DMA aliases are an existing thing within our
IOMMU/PCI constructs, so I'd steer away from "dma_alias" terminology. I
thought the callback was a little less invasive to the IOMMU layer for
now as aux domains are still a rather unique use case, and I'm not sure
we can justify the hack of otherwise IOMMU backed mdevs formally within
the IOMMU API.

Fair enough. I will start with lookup callback solution.

Best regards,
baolu