Re: PCI MSI issue with reinserting a driver

From: Marc Zyngier
Date: Mon Feb 01 2021 - 13:51:20 EST


Hi John,

On Mon, 01 Feb 2021 18:34:59 +0000,
John Garry <john.garry@xxxxxxxxxx> wrote:
>
> Just a heads-up, by chance I noticed that I can't re-insert a specific
> driver on v5.11-rc6:
>
> [ 64.356023] hisi_dma 0000:7b:00.0: Adding to iommu group 31
> [ 64.368627] hisi_dma 0000:7b:00.0: enabling device (0000 -> 0002)
> [ 64.384156] hisi_dma 0000:7b:00.0: Failed to allocate MSI vectors!
> [ 64.397180] hisi_dma: probe of 0000:7b:00.0 failed with error -28
>
> That's with CONFIG_DEBUG_TEST_DRIVER_REMOVE=y
>
> Bisect tells me that this is the first bad commit:
> 4615fbc3788d genirq/irqdomain: Don't try to free an interrupt that has
> no mapping
>
> The relevant driver code is
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/dma/hisi_dma.c#n547
>
> That driver only allocates 30 MSI, so maybe there's a problem with not
> allocating (and freeing) all 32 MSI.

Are they Multi-MSI (and not MSI-X)?

> I'll have a bit more of a look tomorrow.

Here's my suspicion: two of the interrupts are mapped in the low-level
domain (the ITS, I'd expect in your case), but they have never been
mapped at the higher level.

On teardown, we only get rid of the 30 that were actually mapped, and
leave the last two dangling in the ITS domain, and thus the ITS device
resources are never freed. On reload, we request another 32
interrupts, which can't be satisfied for this device.
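
To make the suspected sequence concrete, here is a toy model in plain C. It is
not kernel code: the names toy_dev, toy_alloc, toy_teardown and DEV_EVENTS are
hypothetical, and the 32-slot capacity is an assumption taken from the numbers
in this thread. It only shows how freeing just the mapped interrupts could
leave two slots reserved, so that a second request for 32 fails with -ENOSPC
(the -28 seen in the probe failure).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumed per-device capacity of the low-level (ITS-like) domain. */
#define DEV_EVENTS 32

/* Hypothetical stand-in for the per-device interrupt bookkeeping. */
struct toy_dev {
	bool low_allocated[DEV_EVENTS]; /* slot reserved in the low-level domain */
	bool high_mapped[DEV_EVENTS];   /* slot also mapped at the higher level */
};

/*
 * Reserve nvec slots in the low-level domain, but map only the first
 * nmapped of them at the higher level. Returns 0 or -ENOSPC.
 */
static int toy_alloc(struct toy_dev *d, int nvec, int nmapped)
{
	int i, free_slots = 0, taken = 0;

	assert(nmapped >= 0 && nmapped <= nvec);

	for (i = 0; i < DEV_EVENTS; i++)
		if (!d->low_allocated[i])
			free_slots++;

	if (free_slots < nvec)
		return -ENOSPC; /* errno 28 on Linux */

	for (i = 0; i < DEV_EVENTS && taken < nvec; i++) {
		if (d->low_allocated[i])
			continue;
		d->low_allocated[i] = true;
		d->high_mapped[i] = (taken < nmapped);
		taken++;
	}
	return 0;
}

/*
 * Tear down the device's interrupts. With free_unmapped == false, slots
 * that were never mapped at the higher level are skipped and stay
 * reserved. Returns the number of slots left dangling.
 */
static int toy_teardown(struct toy_dev *d, bool free_unmapped)
{
	int i, leaked = 0;

	for (i = 0; i < DEV_EVENTS; i++) {
		if (!d->low_allocated[i])
			continue;
		if (d->high_mapped[i] || free_unmapped) {
			d->low_allocated[i] = false;
			d->high_mapped[i] = false;
		} else {
			leaked++;
		}
	}
	return leaked;
}
```

In this model, toy_alloc(&d, 32, 30) followed by toy_teardown(&d, false)
leaves two slots reserved, and the next toy_alloc(&d, 32, 30) returns -ENOSPC.
Whether the real allocation and teardown paths behave this way is exactly what
remains to be confirmed.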

Assuming I got it right, the question is: why weren't these interrupts
mapped in the PCI domain in the first place? And if I got it wrong, I'm
even more curious!

Thanks,

M.

--
Without deviation from the norm, progress is not possible.