Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2:hang in atomic copy)

From: Linus Torvalds
Date: Wed Apr 25 2007 - 19:58:26 EST




On Thu, 26 Apr 2007, Pavel Machek wrote:
>
> > For suspend to ram, in contrast, since you *know* that nobody will be
> > touching the hardware, and since the timings are very different anyway
> > (you'd hope that you can resume in a second or two), you'd generally want
> > to keep the DMA engine tables right where they are, and just literally
> > suspend the PCI chip itself.
>
> I'd actually prefer resume to be similar to module insert, too... Do
> you think that resume is _that_ time critical?

I think it probably depends on the device, and it should depend on the
driver writer how he wants to do it.

My _point_ is that there is absolutely zero reason to think that the two
events are the same. We *know* that for snapshot+shutdown, we need to
actually keep the DMA tables intact *over* the snapshot (because writing
out the snapshot may _need_ them). But exactly because we keep them
intact, a driver writer may sanely say "I didn't even bother shutting them
down, so on thaw, I cannot trust them, so I'll just re-initialize them
entirely".

In contrast, over suspend-to-ram, it's entirely reasonable to just leave
them in memory, and just keep them. There's no *reason* not to.

And that's my whole point in this argument: the two paths are
fundamentally totally different. You *claim* that "snapshot()" needs to
stop DMA etc, but that's simply not true.

So I claim:
- for a lot of devices, it's actually a *lot* easier to just have
snapshot not do anythign at all, and re-initialze on thaw
- for those same devices, for s2ram, since the tables are *safe* and
don't get touched by anything else, it's probably easier to just let
them be.

See? The "it's easier to do X" is a _different_ X for the two cases.

So the whole "suspend is a superset of freeze" is simply not true.

> [I'd like you to drop me a line saying you understand current design
> and that it works -- even if it is very inelegant]

I _do_ understand the current design. I just think that it's totally
and seriously broken. I've ranted against it before. I think it's stupid
to play like you're "suspending" something just to save some state,
especially since most users probably don't even *want* to suspend the
state, and would quite happily re-initialize the chip instead.

And I think it's horrible to have a dynamic flag to tell the difference
between two or more state changes that the devices should statically be
able to determine. _If_ some driver really does have the same routine,
just use the same routine. There are no downsides to splitting them up.

> Now, we can separate suspend/freeze and resume/thaw (with some common
> helpers). It will speed the code up by avoiding unneccessary
> operations. It also needs attetion from driver writers (ouch).
>
> Do we want to do that?

I'd personally certainly want to do that. But I want to split up the
callers too. Right now we mix those a lot as well. I suspect that would
automatically be fixed by just forcing them to separate out (since they
now call different functions of the devices), but I'm not 100% sure. There
might be other issues.

Just as an example: one of the most painful things there is in the suspend
sequence is that we shut off the console (because the console device will
be suspended in hw, and it's thus not safe to use it over a suspend/resume
sequence). That should just go away entirely for "snapshot()", because
there is *never* any excuse for actually turning off the console during a
snapshot: even a network console should be entirely functional. Things
like that - things that sound like small issues, but that really aren't.

(Right now you can enable the "don't disable the console" config option,
but since network drivers will actually shut down etc, it just means that
you'll have oopses etc if you do, and you have netconsole enabled)

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/