Re: Removal of bus->msi assignment breaks MSI with stacked domains

From: Thomas Gleixner
Date: Thu Nov 20 2014 - 18:10:29 EST


On Thu, 20 Nov 2014, Bjorn Helgaas wrote:
> On Thu, Nov 20, 2014 at 04:31:45PM +0000, Marc Zyngier wrote:
> > Bjorn, Yijing,
> >
> > I've just realized that patch c167caf8d174 (PCI/MSI: Remove useless
> > bus->msi assignment) completely breaks MSI on arm64 when using the new
> > MSI stacked domain:
> >
> > This patch relies on architectures to implement either
> > pcibios_msi_controller() or arch_setup_msi_irq(). It turns out that with
> > stacked domains, none of this is actually necessary, as long as you
> > have access to the msi_controller.
> >
> > And everything was fine until this patch came around (and I managed to
> > test on a system where the PCI devices are not directly attached to the
> > root bus). Of course, everything now breaks, as we cannot get to the MSI
> > controller (which contains the domain we allocate the MSIs from).
> >
> > In short, this patch breaks an important feature on which arm64 relies,
> > and I believe this patch should be reverted ASAP.
>
> I'm happy to revert it from pci/msi, but I think Thomas has already pulled
> it into his branch, so he'd have to drop it, too.
>
> Thomas, let me know if you want to do that. I suppose we could add a new
> patch to add it back, but that would leave bisection broken for the
> interval between c167caf8d174 and the patch that adds it back.

Fortunately my irq/irqdomain branch is not immutable yet, so we have
no problem at this point. I can rebase on your branch any time until
tomorrow night. Or I just rebase on mainline and we sort out the merge
conflicts later, i.e. delegate them to Linus so his job of pulling
stuff does not get completely boring.

What I'm more worried about is whether this intended change is going
to cause a problem for Jiang's intention to deduce the MSI irq domain
from the device, which we really need for making DMAR work w/o jumping
through hoops.

I have limited knowledge about the actual scope of iommu (DMAR) units
versus device/bus/host-controllers, so I would appreciate a proper
explanation for that from you or Jiang or both.

My gut feeling tells me that anything less granular than the bus
level is wrong, and according to my limited knowledge Intel even has
DMARs which are assigned to a single device, which makes it even more
wrong. So the proper change would be not to push it from the bus to
something above the bus, but instead to make it a per device property.
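
To illustrate the difference, here is a rough sketch. The structures
are simplified stand-ins made up for this mail, not the kernel's real
struct pci_bus / struct pci_dev, and the per device msi pointer and
both function names are hypothetical. It only shows why a lookup on
the device's own bus comes back empty behind a bridge once the child
bus no longer gets the pointer assigned, and how a per device property
with a fallback walk up the bus hierarchy would behave:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the kernel definitions. */
struct msi_controller {
	const char *name;
};

struct pci_bus {
	struct pci_bus *parent;		/* NULL for the root bus */
	struct msi_controller *msi;	/* per bus pointer, may be NULL */
};

struct pci_dev {
	struct pci_bus *bus;
	struct msi_controller *msi;	/* hypothetical per device property */
};

/* Per bus lookup: only the bus the device sits on is consulted. */
struct msi_controller *msi_from_bus(const struct pci_dev *dev)
{
	assert(dev != NULL && dev->bus != NULL);
	return dev->bus->msi;
}

/*
 * Per device lookup: prefer the device's own pointer, otherwise walk
 * up the bus hierarchy until some bus provides a controller.
 */
struct msi_controller *msi_from_dev(const struct pci_dev *dev)
{
	const struct pci_bus *bus;

	assert(dev != NULL);
	if (dev->msi)
		return dev->msi;

	for (bus = dev->bus; bus; bus = bus->parent) {
		if (bus->msi)
			return bus->msi;
	}
	return NULL;
}
```

With a device behind a bridge whose child bus has no msi pointer set,
msi_from_bus() returns NULL, while msi_from_dev() still finds the
controller on the root bus, or the device's own one if it has been
set.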

But my knowledge there is limited, so I rely on the PCI/architecture
experts to sort that out.

Let me know ASAP.

Thanks,

tglx