Re: MSI irqchip configured as IRQCHIP_ONESHOT_SAFE causes spurious IRQs

From: Ramon Fried
Date: Fri Jan 17 2020 - 10:43:23 EST


On Fri, Jan 17, 2020 at 4:38 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Ramon,
>
> Ramon Fried <rfried.dev@xxxxxxxxx> writes:
> >> So from a software perspective you want to do something like this:
> >>
> >> gic_irq_handler()
> >> {
> >> mask_ack(gic_irqX);
> >>
> >> pending = read(msi_status);
> >> for_each_bit(bit, pending) {
> >> ack(msi_status, bit); // This clears the latch in the MSI block
> >> handle_irq(irqof(bit));
> >> }
> >> unmask(gic_irqX);
> >> }
> >>
> >> And that works perfectly correct without masking the MSI interrupt at
> >> the PCI level for a threaded handler simply because the PCI device
> >> will not send another interrupt until the previous one has been
> >> handled by the driver unless the PCI device is broken.
> >
> > I'm missing something here, isn't this implementation blocks IRQ's only during
> > the HW handler and not during the threaded handler ? (Assuming that I selected
> > handle_level_irq() as the default handler)
>
> handle_level_irq() is the proper handler for the actual GIC interrupt
> which does the demultiplexing. The MSI interrupts want to have
> handle_edge_irq().
>
> > Actually my implementation current implementation is very similar to what
> > you just described:
> >
> > static void eq_msi_isr(struct irq_desc *desc)
> > {
> > struct irq_chip *chip = irq_desc_get_chip(desc);
> > struct eq_msi *msi;
> > u16 status;
> > unsigned long bitmap;
> > u32 bit;
> > u32 virq;
> >
> > chained_irq_enter(chip, desc);
> > msi = irq_desc_get_handler_data(desc);
> >
> > while ((status = readw(msi->gcsr_regs_base + LINK_GCSR5_OFFSET)
> > & MSI_IRQ_REQ) != 0) {
> > pr_debug("MSI: %x\n", status >> 12);
> >
> > bitmap = status >> 12;
> > for_each_set_bit(bit, &bitmap, msi->num_of_vectors) {
> > virq = irq_find_mapping(msi->inner_domain, bit);
> > if (virq) {
> > generic_handle_irq(virq);
> > } else {
> > pr_err("unexpected MSI\n");
> > handle_bad_irq(desc);
>
> Now if you look at the example I gave you there is a subtle difference:
>
> >> pending = read(msi_status);
> >> for_each_bit(bit, pending) {
> >> ack(msi_status, bit); // This clears the latch in the MSI block
> >> handle_irq(irqof(bit));
> >> }
>
> And this clearing is important when one of the MSI interrupts is
> actually having a threaded handler.
>
> MSI interrupt fires
> -> sets bit in msi_status
> -> MSI block raises GIC interrupt because msi_status != 0
>
> CPU handles GIC interrupt
> pending = read(msi_status);
> for_each_bit(bit, pending)
> handle_irq()
> primary_handler()
> -> WAKEUP_THREAD
>
> RETI, but msi_status is still != 0
>
> > Additionally the domain allocation is defined like:
> > static int eq_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
> > unsigned int nr_irqs, void *args)
> > {
> > struct eq_msi *msi = domain->host_data;
> > unsigned long bit;
> > u32 mask;
> >
> > /* We only allow 32 MSI per device */
> > WARN_ON(nr_irqs > 32);
> > if (nr_irqs > 32)
> > return -ENOSPC;
> >
> > bit = find_first_zero_bit(msi->used, msi->num_of_vectors);
> > if (bit >= msi->num_of_vectors)
> > return -ENOSPC;
> >
> > set_bit(bit, msi->used);
> >
> > mask = readw(msi->gcsr_regs_base + LINK_GCSR6_OFFSET);
> > mask |= BIT(bit) << 12;
> > writew(mask, msi->gcsr_regs_base + LINK_GCSR6_OFFSET);
> >
> > irq_domain_set_info(domain, virq, bit, &eq_msi_bottom_irq_chip,
> > domain->host_data, handle_level_irq,
>
> This is wrong. MSI is edge type, not level and you are really mixing up
> the concepts here.
>
> The fact that the MSI block raises a level interrupt on the output side
> has absolutely nothing to do with the type of the MSI interrupt itself.
>
> MSI is edge type by definition and this does not change just because
> there is a translation unit between the MSI interrupt and the CPU
> controller.
>
> The actual MSI interrupts do not even know about the existance of that
> MSI block at all. They do not care, as all they need to know is a
> message and an address. When an interrupt is raised in the device the
> MSI chip associated to the device (PCI or something else) writes this
> message to the address exactly ONCE. And this exactly ONCE defines the
> edge nature of MSI.
OK, now I understand my mistake. thanks.
>
> A proper designed MSI device should not send another message before the
> interrupt handler which is associated to the device has handled the
> interrupt at the device level.
By "MSI device" you mean the MSI controller in the SOC or the endpoint
that sends the MSI ?
>
> So you really have to understand that the GIC interrupt and the MSI
> interrupts are two different entities. They just have a 'connection'
> because the message/address which is handed to the MSI device triggers
> that GIC interrupt via the MSI translation unit. But they are still
> different and independent entities.
>
> See?
>
> Thanks,
>
> tglx
>
>
>
>
>
>