x86_64, Haswell, unhandled vga irqs before i915 loaded

From: Nikolai Zhubr
Date: Sat May 14 2016 - 04:22:10 EST


Hello all,

Some of my motherboards exhibit the [in]famous bug where the integrated Intel video core raises unwanted (and unhandled) IRQs on monitor cable plug/unplug events while no chipset-specific driver is loaded (yet). According to forum reports, this apparently started happening (sometimes?) with Sandy Bridge, and it affects both the analog VGA output and the digital (DVI, HDMI) outputs.

Currently I observe this with kernel 4.1.13 (I could also test a later one, but I suppose nothing relevant has changed since). Unfortunately, the kernel's warning message is extremely unhelpful in this particular case: the IRQ line in question usually happens to be shared with some USB controller and/or network card, so the kernel keeps talking about those devices, and people end up totally confused.

How to reproduce:
- get a relevant motherboard (mine is a Gigabyte GA-B85-HD3 rev 2.1)
- ensure i915.ko (or any fb driver) is not loaded automatically on boot
- watch dmesg and /proc/interrupts
- start plugging/unplugging the VGA or HDMI monitor cable

What you will then see (approximately):

/proc/interrupts:
...
16: 100001 0 0 IR-IO-APIC-fasteoi ehci_hcd:usb5
...
(the counter increases by 100000 in a fraction of a second)

dmesg:
irq 16: nobody cared (try booting with the "irqpoll" option)
...
Disabling IRQ #16
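
The "watch /proc/interrupts" step can be scripted. The following is only a rough sketch: the function names and the threshold of 50000 interrupts per polling interval are arbitrary choices made for illustration, not something the kernel defines. It sums each IRQ's per-CPU counters and reports lines whose total jumps between two snapshots.

```python
import time


def parse_interrupts(text):
    """Return {irq_label: total count summed over all CPUs}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the controller/driver name columns
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def storming_irqs(before, after, threshold=50000):
    """Return {irq_label: delta} for IRQs that grew by at least threshold."""
    deltas = {irq: after[irq] - before.get(irq, 0) for irq in after}
    return {irq: d for irq, d in deltas.items() if d >= threshold}


def watch(path="/proc/interrupts", interval=1.0, threshold=50000):
    """Poll the file forever and print IRQs whose counters are exploding."""
    with open(path) as f:
        previous = parse_interrupts(f.read())
    while True:
        time.sleep(interval)
        with open(path) as f:
            current = parse_interrupts(f.read())
        hits = storming_irqs(previous, current, threshold)
        for irq, delta in sorted(hits.items()):
            print(f"IRQ {irq}: +{delta} in {interval:g}s")
        previous = current
```

Calling watch() in one terminal while plugging and unplugging the cable should name the affected line (16 in the example above), whatever driver names happen to be listed next to it.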

Meanwhile, the FreeBSD people have apparently managed to fix the problem with a one-liner patch, but I cannot test it yet because adapting this patch for Linux is a bit beyond my capability.
This is the discussion:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=156596
This is the patch:
https://svnweb.freebsd.org/base?view=revision&revision=284012
This is the commit log:
Disable VGA PCI interrupts until a chipset driver is loaded for VGA
PCI devices. Else unhandled display adapter interrupts might freeze
the CPU or consume a lot of CPU.
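
For experimenting from userspace on Linux, the same idea might be approximated as below. This is an untested sketch built on assumptions: that the FreeBSD change amounts to setting the INTx Disable bit (bit 10, 0x0400) of the 16-bit PCI command register at config offset 0x04, and that the integrated graphics device sits at 0000:00:02.0 (typical for Intel, but check lspci). The helper names are made up for illustration. Writing to the sysfs config file requires root.

```python
import struct

PCI_COMMAND = 0x04                 # offset of the 16-bit command register
PCI_COMMAND_INTX_DISABLE = 0x0400  # bit 10: mask legacy INTx interrupts


def with_intx_disabled(command):
    """Return the command register value with INTx Disable set."""
    return (command | PCI_COMMAND_INTX_DISABLE) & 0xFFFF


def disable_intx(config_path):
    """Set INTx Disable in a PCI config space file; return (old, new)."""
    with open(config_path, "r+b", buffering=0) as cfg:
        cfg.seek(PCI_COMMAND)
        (old,) = struct.unpack("<H", cfg.read(2))  # little-endian u16
        new = with_intx_disabled(old)
        cfg.seek(PCI_COMMAND)
        cfg.write(struct.pack("<H", new))
    return old, new


# Hypothetical usage (as root, before i915 is loaded); the device
# address below is an assumption and must match your system:
#   disable_intx("/sys/bus/pci/devices/0000:00:02.0/config")
```

This only masks the legacy interrupt line of the device; whether a later-loaded driver clears the bit again on its own is something to verify rather than assume.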

My understanding is that the problem is caused by a buggy BIOS rather than by the kernel, but in my case, for example, no useful BIOS update is available (yet?), and I'd still like to "fix" it somehow.

*** Please CC me, I'm not subscribed!

Thank you,

Regards,
Nikolai