Re: [REGRESSION] Re: [patch V3 09/33] genirq/msi: Add range checking to msi_insert_desc()

From: Marc Zyngier
Date: Mon Feb 20 2023 - 13:29:43 EST


On Mon, 20 Feb 2023 17:11:23 +0000,
"Russell King (Oracle)" <linux@xxxxxxxxxxxxxxx> wrote:
>
> On Fri, Nov 25, 2022 at 12:25:59AM +0100, Thomas Gleixner wrote:
> > Per device domains provide the real domain size to the core code. This
> > allows range checking on insertion of MSI descriptors and also paves the
> > way for dynamic index allocations which are required e.g. for IMS. This
> > avoids external mechanisms like bitmaps on the device side and just
> > utilizes the core internal MSI descriptor storage for it.
> >
> > Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>
> Hi Thomas,
>
> This patch appears to cause a regression on Macchiatobin, delaying the
> boot by about ten seconds due to all the warnings the kernel now
> produces.
>
> > @@ -136,11 +149,16 @@ static bool msi_desc_match(struct msi_de
> >
> > static bool msi_ctrl_valid(struct device *dev, struct msi_ctrl *ctrl)
> > {
> > + unsigned int hwsize;
> > +
> > if (WARN_ON_ONCE(ctrl->domid >= MSI_MAX_DEVICE_IRQDOMAINS ||
> > - !dev->msi.data->__domains[ctrl->domid].domain ||
> > - ctrl->first > ctrl->last ||
> > - ctrl->first > MSI_MAX_INDEX ||
> > - ctrl->last > MSI_MAX_INDEX))
> > + !dev->msi.data->__domains[ctrl->domid].domain))
> > + return false;
> > +
> > + hwsize = msi_domain_get_hwsize(dev, ctrl->domid);
>
> This calls msi_get_device_domain() without taking dev->msi.data->mutex,
> resulting in the lockdep_assert_held() firing for what seems to be every
> MSI created by the Armada 8040 ICU driver, which suggests something isn't
> taking the lock as you expect. Please can you take a look and propose a
> patch to fix this regression.

Since you already worked it out, I only had to translate your words
into the patch below, which solves it for me.

Lockdep also reports[1] a possible circular locking dependency between
phy_attach_direct() and rtnetlink_rcv_msg(), which looks interesting.

Thanks,

M.

[1] https://paste.debian.net/1271454/

diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 783a3e6a0b10..13d96495e6d0 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -1084,10 +1084,13 @@ int msi_domain_populate_irqs(struct irq_domain *domain, struct device *dev,
 	struct xarray *xa;
 	int ret, virq;
 
-	if (!msi_ctrl_valid(dev, &ctrl))
-		return -EINVAL;
-
 	msi_lock_descs(dev);
+
+	if (!msi_ctrl_valid(dev, &ctrl)) {
+		ret = -EINVAL;
+		goto unlock;
+	}
+
 	ret = msi_domain_add_simple_msi_descs(dev, &ctrl);
 	if (ret)
 		goto unlock;
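
For anyone who wants to see the ordering problem outside the kernel, here is a
rough user-space model of it. It is only a sketch: struct toy_dev, the toy_*
helpers and the two populate_* functions are made-up names, the "held" flag is
a crude stand-in for what lockdep tracks, and the range check (last < hwsize)
is an assumption about what the validation does with the hardware size. The
only thing taken from the thread is the shape of the bug: the validation helper
reaches a function that asserts the mutex is held, so it has to run after the
lock is taken, and the error path then has to go through the unlock label.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-device MSI data; not the kernel structure. */
struct toy_dev {
	pthread_mutex_t mutex;
	bool held;             /* crude substitute for lockdep's lock tracking */
	unsigned int hwsize;   /* assumed size of the MSI domain */
	unsigned int warnings; /* how often the "lock held" check fired */
};

static void toy_lock(struct toy_dev *d)
{
	pthread_mutex_lock(&d->mutex);
	d->held = true;
}

static void toy_unlock(struct toy_dev *d)
{
	assert(d->held);
	d->held = false;
	pthread_mutex_unlock(&d->mutex);
}

/* Models a helper that expects the caller to hold the mutex. */
static unsigned int toy_get_hwsize(struct toy_dev *d)
{
	if (!d->held)
		d->warnings++; /* where lockdep_assert_held() would complain */
	return d->hwsize;
}

/* Validation that needs the hardware size, hence needs the lock. */
static bool toy_ctrl_valid(struct toy_dev *d, unsigned int first,
			   unsigned int last)
{
	unsigned int hwsize = toy_get_hwsize(d);

	return first <= last && last < hwsize;
}

/* Old ordering: validate first, lock afterwards. Trips the check. */
static int populate_check_then_lock(struct toy_dev *d, unsigned int first,
				    unsigned int last)
{
	if (!toy_ctrl_valid(d, first, last))
		return -EINVAL;

	toy_lock(d);
	/* descriptor setup would happen here */
	toy_unlock(d);
	return 0;
}

/* New ordering: lock first, validate, leave through the unlock label. */
static int populate_lock_then_check(struct toy_dev *d, unsigned int first,
				    unsigned int last)
{
	int ret = 0;

	toy_lock(d);

	if (!toy_ctrl_valid(d, first, last)) {
		ret = -EINVAL;
		goto unlock;
	}

	/* descriptor setup would happen here */
unlock:
	toy_unlock(d);
	return ret;
}
```

Calling populate_check_then_lock() on a toy_dev initialised with
PTHREAD_MUTEX_INITIALIZER bumps the warnings counter once per call, while
populate_lock_then_check() leaves it untouched and still drops the mutex on
the -EINVAL path.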

--
Without deviation from the norm, progress is not possible.