Re: [4/5] 2.6.21-rc5: known regressions

From: Ingo Molnar
Date: Wed Mar 28 2007 - 08:20:44 EST



* Adrian Bunk <bunk@xxxxxxxxx> wrote:

> Subject : second suspend to disk in a row results in an oops (MSI)
> References : http://lkml.org/lkml/2007/3/17/43
> http://lkml.org/lkml/2007/3/22/150
> http://lkml.org/lkml/2007/3/26/205
> http://lkml.org/lkml/2007/3/26/76
> Submitter : Thomas Meyer <thomas@xxxxxxxx>
> Frédéric Riss <frederic.riss@xxxxxxxxx>
> Marcus Better <marcus@xxxxxxxxx>
> Handled-By : Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
> Patch : http://lkml.org/lkml/2007/3/24/136
> Status : patch was suggested

i can reproduce a crash on the second suspend-to-ram, on a T60. I get a
crash here:

#ifdef CONFIG_PM
static void __pci_restore_msi_state(struct pci_dev *dev)
{
int pos;
u16 control;
struct msi_desc *entry;

if (!dev->msi_enabled)
return;

entry = get_irq_msi(dev->irq);
pos = entry->msi_attrib.pos; <-------- crash on NULL dereference


i.e. 'entry' is NULL after get_irq_msi(). (i can see the crash only on
the VGA screen so no dump of it available. Can write down more info if
it's helpful.)

I have tried Eric's patch above but now i always get a hang after
"system 00:00: resuming", already upon the first suspend-resume. Not
even the NMI watchdog can get the system out of that hang.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/