Re: What should PCI core do during suspend-resume? (was: Re: 2.6.29-rc3: tg3 dead after resume)

From: Linus Torvalds
Date: Sat Jan 31 2009 - 18:40:11 EST

On Sat, 31 Jan 2009, Linus Torvalds wrote:
>
> But how many people test STR while doing a "ping -f" from another machine?
>
> It _should_ work. Do you guarantee that it does?

Btw, this is really only interesting if there's a shared interrupt.

I'm sure that there are network drivers that will crash even on their own
with _just_ the right timing (imagine having a delayed interrupt pending,
then doing the "pci_set_power_state(PCI_D3hot)" thing, and then having the
interrupt handler invoked on another CPU _just_ afterwards), but that's
probably really hard to trigger, and a bug in that specific driver anyway.

But what's much more interesting (and not necessarily a driver bug, but a
general PM infrastructure problem) is the shared interrupt case, where the
network driver gets lots of interrupts just as "driver X", which shares
that interrupt line, is shutting down. Then "driver X" will get
interrupts after the PM layer has put its device to sleep, and now "driver
X" is quite understandably confused - it didn't even do the "put to sleep"
itself, but now its device is no longer responding.
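
To make that concrete, here is a rough user-space sketch of the shared-line
case. It is not kernel code: "struct fake_dev", "fake_read_status" and the
two handlers are all made-up names, and it assumes that a read from a
powered-down device comes back as all ones, which is the usual behaviour
but still an assumption here. The "careful" handler is only one defensive
option a driver could take, not a fix for the infrastructure problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for "driver X"'s device; not a kernel structure. */
struct fake_dev {
    bool powered;     /* false once the PM layer has put the device to sleep */
    uint32_t status;  /* pending-interrupt bits while the device is powered */
    int handled;      /* number of interrupts this driver claimed */
};

/* Assumption: reads from a powered-down device return all ones. */
static uint32_t fake_read_status(const struct fake_dev *dev)
{
    assert(dev != NULL);
    return dev->powered ? dev->status : 0xffffffffu;
}

/* Naive handler: trusts whatever the status read returns. */
static bool naive_handler(struct fake_dev *dev)
{
    uint32_t st = fake_read_status(dev);

    if (st == 0)
        return false;   /* nothing pending: the interrupt was not ours */
    dev->handled++;     /* bogus claim when st is really 0xffffffff */
    return true;
}

/* Careful handler: treats an all-ones read as "device is gone". */
static bool careful_handler(struct fake_dev *dev)
{
    uint32_t st = fake_read_status(dev);

    if (st == 0 || st == 0xffffffffu)
        return false;
    dev->handled++;
    return true;
}

/*
 * On a shared line every registered handler runs for every interrupt,
 * so "count" interrupts from the busy network card mean "count" calls
 * into driver X's handler as well. Returns how many driver X claimed.
 */
static int fire_shared_irq(struct fake_dev *x,
                           bool (*handler)(struct fake_dev *), int count)
{
    int claimed = 0;

    for (int i = 0; i < count; i++)
        if (handler(x))
            claimed++;
    return claimed;
}
```

With the device powered and idle, neither handler claims anything. Once
"powered" is cleared behind the driver's back, the naive handler claims
every interrupt the network card raises, while the careful one keeps
returning "not mine".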

And now it's not a really unlikely race condition any more.

Linus