Re: [PATCH] irqchip/gic-{v2m, v3-its}: check iommu capable before doing iommu map

From: Marc Zyngier
Date: Thu Aug 17 2017 - 05:23:03 EST


On 17/08/17 10:01, Shawn Lin wrote:
> Hi Marc
>
> On 2017/8/17 16:52, Marc Zyngier wrote:
>> On 17/08/17 09:28, Shawn Lin wrote:
>>> If a PCIe RC use gic-v2m or gic-v3-its as a msi domain but doesn't
>>> have iommu support, we don't need to do iommu_dma_map_msi_msg to
>>> get mapped iommu address as all we need is the physical address.
>>> Otherwise we see the oops shown below as we can't get a iommu_group
>>> for a device without iommu capable.
>>>
>>> Unable to handle kernel NULL pointer dereference at virtual address
>>> 000000d0
>>> [00000000000000d0] user address but active_mm is swapper
>>> Internal error: Oops: 96000004 [#1] PREEMPT SMP
>>> Modules linked in:
>>> CPU: 3 PID: 1 Comm: swapper/0 Not tainted
>>> 4.13.0-rc5-next-20170815-00001-g0744890-dirty #53
>>> Hardware name: Firefly-RK3399 Board (DT)
>>> task: ffff80007bc70000 task.stack: ffff80007bc6c000
>>> PC is at iommu_get_domain_for_dev+0x38/0x50
>>> LR is at iommu_dma_map_msi_msg+0x3c/0x1b8
>>> pc : [<ffff000008569210>] lr : [<ffff00000856c1e4>] pstate:
>>> a0000045
>>>
>>> ...
>>>
>>> [<ffff000008569210>] iommu_get_domain_for_dev+0x38/0x50
>>> [<ffff00000856c1e4>] iommu_dma_map_msi_msg+0x3c/0x1b8
>>> [<ffff0000083d4d8c>] its_irq_compose_msi_msg+0x44/0x50
>>> [<ffff00000811be50>] irq_chip_compose_msi_msg+0x40/0x58
>>> [<ffff000008120e34>] msi_domain_activate+0x1c/0x48
>>> [<ffff00000811d9f8>] __irq_domain_activate_irq+0x40/0x58
>>> [<ffff00000811f724>] irq_domain_activate_irq+0x24/0x40
>>> [<ffff00000812139c>] msi_domain_alloc_irqs+0x104/0x190
>>> [<ffff000008445abc>] pci_msi_setup_msi_irqs+0x3c/0x48
>>> [<ffff00000844658c>] __pci_enable_msi_range+0x21c/0x408
>>> [<ffff0000084468a4>] pci_alloc_irq_vectors_affinity+0x104/0x168
>>> [<ffff00000843cd98>] pcie_port_device_register+0x200/0x488
>>> [<ffff00000843d2ac>] pcie_portdrv_probe+0x34/0xd0
>>> [<ffff00000842e2cc>] local_pci_probe+0x3c/0xb8
>>> [<ffff00000842f5f8>] pci_device_probe+0x138/0x170
>>> [<ffff00000857d964>] driver_probe_device+0x21c/0x2d8
>>> [<ffff00000857db6c>] __device_attach_driver+0x9c/0xf8
>>> [<ffff00000857bbfc>] bus_for_each_drv+0x5c/0x98
>>> [<ffff00000857d61c>] __device_attach+0xc4/0x138
>>> [<ffff00000857d6a0>] device_attach+0x10/0x18
>>> [<ffff00000842395c>] pci_bus_add_device+0x4c/0xa8
>>> [<ffff0000084239fc>] pci_bus_add_devices+0x44/0x90
>>> [<ffff000008451e88>] rockchip_pcie_probe+0xc70/0xcc8
>>> [<ffff00000857f738>] platform_drv_probe+0x58/0xc0
>>> [<ffff00000857d964>] driver_probe_device+0x21c/0x2d8
>>> [<ffff00000857dacc>] __driver_attach+0xac/0xb0
>>> [<ffff00000857bb3c>] bus_for_each_dev+0x64/0xa0
>>> [<ffff00000857d258>] driver_attach+0x20/0x28
>>> [<ffff00000857cd08>] bus_add_driver+0x110/0x230
>>> [<ffff00000857e468>] driver_register+0x60/0xf8
>>> [<ffff00000857f690>] __platform_driver_register+0x40/0x48
>>> [<ffff000008dd6dc0>] rockchip_pcie_driver_init+0x18/0x20
>>> [<ffff000008083970>] do_one_initcall+0xb0/0x120
>>> [<ffff000008db0cfc>] kernel_init_freeable+0x184/0x224
>>> [<ffff00000896fce8>] kernel_init+0x10/0x100
>>> [<ffff000008084ad0>] ret_from_fork+0x10/0x18
>>>
>>> This patch is to fix the problem exposed by the commit mentioned below.
>>> Before this commit, iommu has a work around method to fix this but now
>>> it doesn't. So we could fix this in gic code but maybe still need a fixes
>>> tag here.
>>>
>>> Fixes: 05f80300dc8b ("iommu: Finish making iommu_group support mandatory")
>>> Signed-off-by: Shawn Lin <shawn.lin@xxxxxxxxxxxxxx>
>>
>> That looks pretty horrible. Please see the patch I posted in response to
>> your initial report:
>>
>> http://marc.info/?l=linux-pci&m=150295809430903&w=2
>
> Aha, I saw it but I didn't get your point of "I'm not convinced that
> playing with the group refcount in an irqchip is the right approach..."
>
> So, I'm not sure whether you prefer your patch?
I really dislike both:

- yours: because it reinvent the wheel (parsing DT one more time on each
MSI message creation), and is DT specific while the rest of the code is
completely firmware agnostic (what about ACPI?).

- mine: because it relies on an intimate knowledge of the internals of
the iommu stuff, which is not what was originally intended.

My comment was an invitation to Joerg to speak his mind.

The original intent of iommu_dma_map_msi_msg() was that it could silently
fail without exploding in your face. We can change that, but a minimum of
coordination wouldn't hurt.

My personal preference would be to address this in the iommu layer,
restoring a non-exploding iommu_dma_map_msi_msg():

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 44eca1e3243f..5440eae21bea 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -883,12 +883,18 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
{
struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
- struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+ struct iommu_domain *domain;
+ struct iommu_group *group;
struct iommu_dma_cookie *cookie;
struct iommu_dma_msi_page *msi_page;
phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
unsigned long flags;

+ group = iommu_group_get(dev);
+ if (!group)
+ return;
+ domain = iommu_get_domain_for_dev(dev);
+ iommu_group_put(group);
if (!domain || !domain->iova_cookie)
return;

Comments welcome.

M.
--
Jazz is not dead. It just smells funny...