Re: do_IRQ: 1.55 No irq handler for vector (irq -1)

From: Borislav Petkov
Date: Tue Aug 07 2012 - 16:57:15 EST


On Tue, Aug 07, 2012 at 10:45:30AM -0700, Eric W. Biederman wrote:
> >> [ 0.170256] AMD PMU driver.
> >> [ 0.170451] ... version: 0
> >> [ 0.170683] ... bit width: 48
> >> [ 0.170906] ... generic registers: 6
> >> [ 0.171125] ... value mask: 0000ffffffffffff
> >> [ 0.171399] ... max period: 00007fffffffffff
> >> [ 0.171673] ... fixed-purpose events: 0
> >> [ 0.171902] ... event mask: 000000000000003f
> >> [ 0.172687] MCE: In-kernel MCE decoding enabled.
> >> [ 0.184214] [Firmware Info]: CPU: Re-enabling disabled Topology Extensions Support
> >> [ 0.186687] do_IRQ: 1.55 No irq handler for vector (irq -1) <---
> >> [ 0.198126] [Firmware Info]: CPU: Re-enabling disabled Topology Extensions Support
> >> [ 0.200579] do_IRQ: 2.55 No irq handler for vector (irq -1) <---
> >> [ 0.173040] smpboot: Booting Node 0, Processors #1 #2 #3 OK
> >> [ 0.212083] [Firmware Info]: CPU: Re-enabling disabled Topology Extensions Support
> >> [ 0.214538] do_IRQ: 3.55 No irq handler for vector (irq -1) <---
> >> [ 0.214864] Brought up 4 CPUs
> >>
> >> of it now having IRQ handler for vector 55.
> >>
> >> And guess what: reverting those three above make the message go away
> >> too.
> >>
> >
> > Boris, Robert, Can you please send me the complete dmesg
> > and /proc/interrupts on a successful boot?
>
> Hmm. I wonder if this is one of those cases where the apics don't honor
> the masks in lowest priority delivery mode and simply deliver to some
> cpu in the same die.

The funny thing is, they deliver to all CPUs except the BSP.

Or maybe the BSP gets that IRQ too but it actually has a handler
registered?

Btw, I'm stabbing in the dark here - I have been purposefully and
willfully keeping away from all the APIC debacle until now. I guess that
carefree time is over :(.

> Certainly outside of x2apic mode I have seen that happen and that is why
> the reservation in lowest priority delivery mode was for the same vector
> across all cpus.
>
> This certainly looks like we have one irq going across multiple cpus
> and the software simply appears unprepared for the irq to show up where
> the irq is showing up.

The interesting thing is that this happens once per core early during
boot and not anymore. I dropped the printk_ratelimit() in do_IRQ and
still got those lines only once in dmesg.
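
For anyone wanting to tally these messages from a saved dmesg, here is a
minimal sketch in Python. It assumes the two numbers after "do_IRQ:" are the
CPU and the vector, in that order, which is consistent with the per-core
pattern 1.55, 2.55, 3.55 in the log quoted above. The function name
unhandled_vectors and the sample text are illustrative only.

```python
import re
from collections import defaultdict

# Assumed layout: "do_IRQ: <cpu>.<vector> No irq handler for vector (irq <irq>)"
PATTERN = re.compile(
    r"do_IRQ: (\d+)\.(\d+) No irq handler for vector \(irq (-?\d+)\)"
)


def unhandled_vectors(dmesg_text):
    """Map each reported vector to the list of CPUs that reported it."""
    seen = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = PATTERN.search(line)
        if match:
            cpu, vector, _irq = (int(group) for group in match.groups())
            seen[vector].append(cpu)
    return dict(seen)


# Sample lines copied from the quoted boot log (the "<---" markers dropped).
DMESG_SAMPLE = """\
[ 0.186687] do_IRQ: 1.55 No irq handler for vector (irq -1)
[ 0.200579] do_IRQ: 2.55 No irq handler for vector (irq -1)
[ 0.214538] do_IRQ: 3.55 No irq handler for vector (irq -1)
"""

if __name__ == "__main__":
    print(unhandled_vectors(DMESG_SAMPLE))
```

On the sample this groups CPUs 1, 2 and 3 under vector 55, with the BSP
(CPU 0) absent.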

The other funny thing is, irq 55 is not in /proc/interrupts:

           CPU0       CPU1       CPU2       CPU3
  0:         44          0          0          0   IO-APIC-edge     timer
  1:          2          1          2          4   IO-APIC-edge     i8042
  8:          6          7          6          6   IO-APIC-edge     rtc0
  9:         22         25         24         21   IO-APIC-fasteoi  acpi
 12:         31         23         30         30   IO-APIC-edge     i8042
 16:         82         82         81        117   IO-APIC-fasteoi  snd_hda_intel
 17:          0          1          1          0   IO-APIC-fasteoi  ehci_hcd:usb1, ehci_hcd:usb2
 18:          3          6          8          8   IO-APIC-fasteoi  ohci_hcd:usb3, ohci_hcd:usb4, ohci_hcd:usb5
 40:          0          0          0          0   PCI-MSI-edge     PCIe PME
 41:          0          0          0          0   PCI-MSI-edge     PCIe PME
 42:          0          0          0          0   PCI-MSI-edge     PCIe PME
 43:          0          0          0          0   PCI-MSI-edge     PCIe PME
 44:        675        662        676        690   PCI-MSI-edge     ahci
 45:         41         44         38         41   PCI-MSI-edge     snd_hda_intel
 46:      13484      13499      13501      13536   PCI-MSI-edge     eth0
NMI:          0          0          0          0   Non-maskable interrupts
LOC:      20719      21487      18015      16445   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0   Performance monitoring interrupts
IWI:          0          0          0          0   IRQ work interrupts
RTR:          0          0          0          0   APIC ICR read retries
RES:      13744      12640      13425      12334   Rescheduling interrupts
CAL:        571        790        539        801   Function call interrupts
TLB:          0          0          0          0   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:         66         66         66         66   Machine check polls
ERR:          0
MIS:          0

so what is that thing?
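
A quick way to double-check which numbered IRQs are actually listed is to
parse the /proc/interrupts text. The following is a minimal sketch in Python;
the function name listed_irqs and the shortened sample are illustrative only.
It looks only at the IRQ numbers in the first column, so if the 55 in the
do_IRQ message is a vector rather than an IRQ number (an assumption, not
something established in this thread), its absence from the list would not by
itself be surprising.

```python
def listed_irqs(interrupts_text):
    """Return the set of numeric IRQ labels in /proc/interrupts-style text."""
    irqs = set()
    for line in interrupts_text.splitlines():
        # Each data row starts with "<label>:"; the CPU header row has no colon.
        label, sep, _rest = line.strip().partition(":")
        if sep and label.isdigit():
            irqs.add(int(label))
    return irqs


# Shortened sample taken from the listing above.
INTERRUPTS_SAMPLE = """\
 CPU0 CPU1 CPU2 CPU3
 0: 44 0 0 0 IO-APIC-edge timer
 44: 675 662 676 690 PCI-MSI-edge ahci
 46: 13484 13499 13501 13536 PCI-MSI-edge eth0
NMI: 0 0 0 0 Non-maskable interrupts
"""

if __name__ == "__main__":
    irqs = listed_irqs(INTERRUPTS_SAMPLE)
    print(sorted(irqs), 55 in irqs)
```

On a live system the same function can be fed the contents of
/proc/interrupts, e.g. listed_irqs(open("/proc/interrupts").read()).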

I'll read up on lowest prio delivery mode tomorrow.

Thanks.

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551