Re: WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90() - evildoer found and neutralized

From: Jiang Liu
Date: Wed Sep 30 2015 - 03:45:47 EST


This is a multi-part message in MIME format.n 2015/9/29 18:51, Borislav Petkov wrote:
> On Tue, Sep 29, 2015 at 04:50:36PM +0800, Jiang Liu wrote:
>> So could you please help to apply the attached debug patch to gather
>> more information about the regression?
>
> Sure, just did.
>
> I'm sending you a full s/r cycle attempt caught over serial in a private
> message.

Hi Boris,
>From the log file, we got to know that the NULL pointer dereference
was caused by AMD IOMMU device. For normal MSI-enabled PCI devices, we get
valid irq numbers such as:
[ 74.661170] ahci 0000:04:00.0: irqdomain: freeze msi 1 irq28
[ 74.661297] radeon 0000:01:00.0: irqdomain: freeze msi 1 irq47
But for AMD IOMMU device, we got an invalid irq number(0) after
enabling MSI as:
[ 74.662488] pci 0000:00:00.2: irqdomain: freeze msi 1 irq0
which then caused NULL pointer deference when __pci_restore_msi_state()
gets called by system resume code.
So we need to figure out why we got irq number 0 after enabling
MSI for AMD IOMMU device. The only hint I got is that iommu driver just
grabbing the PCI device without providing a PCI device driver for IOMMU
PCI device, we have solved a similar case for eata driver. So could you
please help to apply this debug patch to gather more info and send me
/proc/interrupts?
Thanks!
Gerry

O>
> Thanks.
>