Re: [PATCH v4 09/19] irqdomain: Fix mapping-creation race

From: Johan Hovold
Date: Thu Feb 09 2023 - 08:07:31 EST


On Wed, Jan 18, 2023 at 10:40:57AM +0100, Johan Hovold wrote:
> On Tue, Jan 17, 2023 at 10:39:51PM +0100, Thomas Gleixner wrote:
> > On Mon, Jan 16 2023 at 14:50, Johan Hovold wrote:

> > > @@ -802,6 +811,8 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
> > > if (WARN_ON(type & ~IRQ_TYPE_SENSE_MASK))
> > > type &= IRQ_TYPE_SENSE_MASK;
> > >
> > > + mutex_lock(&irq_domain_mutex);
> > > +
> > > /*
> > > * If we've already configured this interrupt,
> > > * don't do it again, or hell will break loose.
> > > @@ -814,7 +825,7 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
> > > * interrupt number.
> > > */
> > > if (type == IRQ_TYPE_NONE || type == irq_get_trigger_type(virq))
> > > - return virq;
> > > + goto out;
> > >
> > > /*
> > > * If the trigger type has not been set yet, then set
> > > @@ -823,36 +834,43 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
> > > if (irq_get_trigger_type(virq) == IRQ_TYPE_NONE) {
> > > irq_data = irq_get_irq_data(virq);
> > > if (!irq_data)
> > > - return 0;
> > > + goto err;
> > >
> > > irqd_set_trigger_type(irq_data, type);
> > > - return virq;
> > > + goto out;
> > > }
> > >
> > > pr_warn("type mismatch, failed to map hwirq-%lu for %s!\n",
> > > hwirq, of_node_full_name(to_of_node(fwspec->fwnode)));
> > > - return 0;
> > > + goto err;
> > > }
> > >
> > > if (irq_domain_is_hierarchy(domain)) {
> > > - virq = irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, fwspec);
> > > + virq = ___irq_domain_alloc_irqs(domain, -1, 1, NUMA_NO_NODE,
> > > + fwspec, false, NULL);
> > > if (virq <= 0)
> > > - return 0;
> > > + goto err;
> > > } else {
> > > /* Create mapping */
> > > virq = __irq_create_mapping_affinity(domain, hwirq, NULL);
> > > if (!virq)
> > > - return virq;
> > > + goto err;
> > > }
> > >
> > > irq_data = irq_get_irq_data(virq);
> > > if (WARN_ON(!irq_data))
> > > - return 0;
> > > + goto err;
> > >
> > > /* Store trigger type */
> > > irqd_set_trigger_type(irq_data, type);
> > > +out:
> > > + mutex_unlock(&irq_domain_mutex);
> > >
> > > return virq;
> > > +err:
> > > + mutex_unlock(&irq_domain_mutex);
> > > +
> > > + return 0;
> > > }
> > > EXPORT_SYMBOL_GPL(irq_create_fwspec_mapping);
> >
> > You can spare that goto churn by renaming the existing function to
> > irq_create_fwspec_mapping_locked() and invoked that guarded by the
> > mutex, no?
>
> That may be possible, but I'm not sure it's better. It wasn't really
> obvious which of the above returns where error paths and which were
> success paths before this change.
>
> I'll try and see what it would look like.

I gave it a try but the resulting diff is even bigger and also obscures
the reason for why the locking is there (i.e. that the lookup and and
creation needs to be done atomically).

Adding unlocked domain lookup helpers as a preparatory patch could
possible help somewhat with the diff size, but that doesn't really make
sense with the per-domain locking added by the final patch.

(I've made some experiments to quantify the parallel probe speed up that
using per-domain locks can result in so I'm including that for v5.)

Johan