Re: "do_IRQ: 0.89 No irq handler for vector (irq -1)"

From: Jesse Barnes
Date: Mon Oct 11 2010 - 18:45:20 EST


On Fri, 8 Oct 2010 21:46:50 +1000
Dave Airlie <airlied@xxxxxxxxx> wrote:

> On Fri, Oct 8, 2010 at 5:52 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > On Thu, 7 Oct 2010, Thomas Gleixner wrote:
> >> Oct  7 23:21:24 ionos kernel: Console: switching to colour VGA+ 80x25
> >> Oct  7 23:21:31 ionos kernel: drm: unregistered panic notifier
> >> Oct  7 23:21:31 ionos kernel: vga_switcheroo: disabled
> >> Oct  7 23:21:31 ionos kernel: [drm:drm_mm_takedown] *ERROR* Memory manager not clean. Delaying takedown
> >>
> >> That one scares me :)
> >>
> >> Oct  7 23:21:32 ionos kernel: BUG: unable to handle kernel paging request at 00000037362e313a
> >>
> >> We are again dereferencing a user space address.
> >
> > Further debugging shows that the interrupt is torn down and the
> > vectors are cleared. On modprobe the irq is set up again and a
> > different vector is assigned.
> >
> > The interrupt which comes in is going to the old vector. So something
> > is stale in the card.
>
> Okay I've traced it with the hints Thomas gave me,
>
> What happens is we have never called pci_disable_device on video
> devices for various reasons, like shutting down VGA devices could be
> hostile,
>
> however on rmmod of the driver the PCI layer sets the device power
> state to PCI_UNKNOWN, when we next load the driver we call
> pci_enable_device which sees the enable cnt is 1, so never calls
> pci_set_power_state(PCI_D0), the MSI code won't actually write MSI
> msgs unless it knows we are in D0.
>
> Not sure how best to fix, I can workaround by calling
> pci_set_power_state(PCI_D0) in the drm drivers, but I sorta think the
> PCI layer should take care of this.

So I think we *should* be able to call pci_disable_device at remove
time. But as you say, some platforms may not correctly re-route VGA
space to an existing device or disable it properly when we do that.
AFAICT x86 will be ok here though (seems to work ok locally too).

That said, it seems like we should update the current device state at
load time as well, once we've matched the driver it seems like there
should be no harm.

Rafael, what do you think? Would having the correct power state at
load time cause any trouble with other PM code? I know we've had
issues with setting it explicitly in the past...

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center