Re: x86 and I/O APIC IRQ domains

From: Sebastian Andrzej Siewior
Date: Wed Aug 01 2012 - 10:45:41 EST


On 08/01/2012 03:58 PM, Thierry Reding wrote:
> I've been working on an x86 platform and want to use DT. However I've
> hit a snag when trying to instantiate the I/O APIC. I've been trying to
> follow what the CE4100 does and most things seem to work fine but when
> I add the DT node for the I/O APIC things start to fail. I've been able
> to trace the issue to x86_add_irq_domains(), which in turn calls
> ioapic_add_ofnode() from which irq_domain_add_legacy() is called.

> The platform that I use hits the WARN_ON(!irq_data || irq_data->domain).
> Looking further this seems to be caused by irq_get_irq_data(irq)
> returning NULL for all irq >= 16. That in turn I think is due to
> init_ISA_irqs() setting up only the first NR_IRQS_LEGACY interrupts.
> However the call to irq_domain_add_legacy() wants 32 interrupts.

The IOAPIC knows how many sources are available and this number should
be used instead of 32.
This won't solve the problem with irq_get_irq_data() for a second IOAPIC
which might be present in the system.
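
To illustrate "the IOAPIC knows how many sources are available": the
I/O APIC version register (index 0x01) reports the highest redirection
entry in bits 23:16, so the number of pins is that field plus one. A
minimal userspace sketch of the decoding follows. The function name is
made up for this example and is not a kernel API.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): derive the number of
 * interrupt pins from a raw I/O APIC version register value.
 * Bits 23:16 hold the index of the last redirection entry.
 */
static unsigned int ioapic_pins_from_version(uint32_t ver)
{
	return ((ver >> 16) & 0xff) + 1;
}
```

A version value of 0x00170011 decodes to 24 pins, so the domain would be
sized to 24 instead of a hard-coded 32.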

The reason why there is no irq_data available is that it is allocated
by io_apic_setup_irq_pin_once(), which is now called too late. Usually
a PCI device does a pci_enable_device() call and then we do all this.
So to keep the function happy you should preallocate all interrupts
which are offered. Ah, and you may need a map function which does
nothing because the programming is done at io_apic_setup_irq_pin_once()
time.
Maybe you could live with irq_domain_add_linear() instead. I'm not sure
how important it is to keep the RTC at a fixed irq. I think as far as
the IOAPIC is concerned, it could be programmed to another number but I
kept it in sync. However, parts of the ioapic code rely on gsi_number ==
irq number, so maybe we should preallocate the irq_data and use a dummy
map() function to start with.
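
The idea can be modelled in plain userspace C. This is only a sketch of
the reasoning, not kernel code: every model_* name is invented for the
example, and the check merely imitates the WARN_ON(!irq_data ||
irq_data->domain) condition quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_IRQS        64
#define MODEL_NR_IRQS_LEGACY 16

struct model_domain;

/* Stand-in for struct irq_data: exists or not, owned by a domain or not. */
struct model_irq_data {
	bool allocated;
	const struct model_domain *domain;
};

struct model_domain {
	unsigned int first_irq;
	unsigned int size;
	int (*map)(unsigned int irq, unsigned int hwirq);
};

static struct model_irq_data model_irqs[MODEL_NR_IRQS];

/* Imitates init_ISA_irqs(): only the legacy interrupts get irq_data. */
static void model_init_legacy(void)
{
	for (unsigned int i = 0; i < MODEL_NR_IRQS_LEGACY; i++)
		model_irqs[i].allocated = true;
}

/* Preallocate irq_data for every interrupt the domain is going to offer. */
static int model_prealloc(unsigned int first, unsigned int count)
{
	if (first + count > MODEL_NR_IRQS)
		return -1;
	for (unsigned int i = first; i < first + count; i++)
		model_irqs[i].allocated = true;
	return 0;
}

/* Dummy map(): the real programming happens later, at pin setup time. */
static int model_noop_map(unsigned int irq, unsigned int hwirq)
{
	(void)irq;
	(void)hwirq;
	return 0;
}

/* Imitates the legacy-domain check: every irq must exist and be unowned. */
static int model_domain_add_legacy(const struct model_domain *d)
{
	if (d->first_irq + d->size > MODEL_NR_IRQS)
		return -1;
	for (unsigned int i = 0; i < d->size; i++) {
		const struct model_irq_data *data = &model_irqs[d->first_irq + i];

		if (!data->allocated || data->domain)
			return -1;	/* this is where the WARN_ON would fire */
	}
	for (unsigned int i = 0; i < d->size; i++) {
		model_irqs[d->first_irq + i].domain = d;
		d->map(d->first_irq + i, i);
	}
	return 0;
}
```

With only the 16 legacy interrupts set up, adding a 24-entry domain
fails in this model. After model_prealloc(0, 24) the same call succeeds,
and the map callback has nothing to do.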

> This was introduced by commit b4e5185 "irq_domain/x86: Convert x86
> (embedded) to use common irq_domain". I wonder what I'm doing wrong. I
> don't get how this is made to work on CE4100.
Currently it does not.

> Later the code crashes, but I can't exactly pinpoint the location
> because the oops doesn't fit on the screen. I don't have a serial port
> that I can use instead, so is there anything else I can do to obtain a
> complete backtrace?

There is an option which delays each printk by a few msecs (the
boot_delay= kernel parameter, available with CONFIG_BOOT_PRINTK_DELAY).
Maybe this could help.


> Thierry

Sebastian