Re: arm64 iommu groups issue

From: John Garry
Date: Thu Feb 13 2020 - 10:50:08 EST



The underlying issue is that, for historical reasons, OF/IORT-based
IOMMU drivers have ended up with group allocation being tied to endpoint
driver probing via the dma_configure() mechanism (long story short,
driver probe is the only thing which can be delayed in order to wait for
a specific IOMMU instance to be ready). However, in the meantime, the
IOMMU API internals have evolved sufficiently that I think there's a way
to really put things right - I have the spark of an idea which I'll try
to sketch out ASAP...


OK, great.

Hi Robin,

I was wondering if you have had a chance to consider this problem again?

One simple idea could be to introduce a device link between the endpoint device and its parent bridge to ensure that they probe in order, as expected in pci_device_group():

Subject: [PATCH] PCI: Add device link to ensure endpoint device driver probes after bridge

It is required to ensure that a device driver for an endpoint will probe
after the parent port driver, so add a device link for this.

---
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 512cb4312ddd..4b832ad25b20 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2383,6 +2383,7 @@ static void pci_set_msi_domain(struct pci_dev *dev)
 void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 {
 	int ret;
+	struct device *parent;

 	pci_configure_device(dev);

@@ -2420,6 +2421,10 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 	/* Set up MSI IRQ domain */
 	pci_set_msi_domain(dev);

+	parent = dev->dev.parent;
+	if (parent && parent->bus == &pci_bus_type)
+		device_link_add(&dev->dev, parent, DL_FLAG_AUTOPROBE_CONSUMER);
+
 	/* Notifier could use PCI capabilities */
 	dev->match_driver = false;
 	ret = device_add(&dev->dev);
--

This would work, but the problem is that if the port driver fails to probe - and not just with -EPROBE_DEFER - then the child device will never probe. This very thing happens on my dev board. However, we could expand the device links API to cover this sort of scenario.
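
To illustrate the failure mode, here is a minimal userspace sketch, not kernel code. It assumes a simplified model in which a consumer is held back until its supplier is bound, which is roughly how I understand managed device links to behave. All of the toy_* names, the probe_state enum and the retry loop are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model only; every name below is hypothetical. */
#define TOY_EPROBE_DEFER 517	/* same numeric value as the kernel's EPROBE_DEFER */
#define TOY_ENODEV 19		/* same numeric value as ENODEV */

enum probe_state { NOT_PROBED, BOUND, DEFERRED, FAILED };

struct toy_dev {
	const char *name;
	struct toy_dev *supplier;	/* models the link to the parent bridge */
	int probe_ret;			/* what this device's driver probe returns */
	enum probe_state state;
};

/* One probe attempt; a consumer is held back until its supplier is bound. */
static enum probe_state toy_try_probe(struct toy_dev *dev)
{
	assert(dev != NULL);

	/* Bound devices stay bound; a hard failure is never retried. */
	if (dev->state == BOUND || dev->state == FAILED)
		return dev->state;

	/* Supplier not bound yet (or never will be): keep deferring. */
	if (dev->supplier && dev->supplier->state != BOUND) {
		dev->state = DEFERRED;
		return dev->state;
	}

	if (dev->probe_ret == 0)
		dev->state = BOUND;
	else if (dev->probe_ret == -TOY_EPROBE_DEFER)
		dev->state = DEFERRED;
	else
		dev->state = FAILED;

	return dev->state;
}

/* Retry every device for a fixed number of rounds, loosely like deferred probe. */
static void toy_probe_all(struct toy_dev **devs, size_t n, int rounds)
{
	for (int r = 0; r < rounds; r++)
		for (size_t i = 0; i < n; i++)
			toy_try_probe(devs[i]);
}
```

In this model, a bridge whose probe_ret is -TOY_ENODEV ends up FAILED, and an endpoint linked to it stays DEFERRED no matter how many rounds toy_probe_all() runs. If the bridge's probe_ret is 0 instead, the endpoint binds on the round after the bridge does.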

As for alternatives, it looks pretty difficult to me to disassociate the group allocation from the dma_configure path.

Let me know.

Thanks,
John