Re: Back to the future.

From: Nigel Cunningham
Date: Fri Apr 27 2007 - 19:57:48 EST


Hi.

On Sat, 2007-04-28 at 01:45 +0200, Rafael J. Wysocki wrote:
> On Saturday, 28 April 2007 01:17, Linus Torvalds wrote:
> >
> > On Sat, 28 Apr 2007, Rafael J. Wysocki wrote:
> > >
> > > > And can you name a _single_ advantage of doing so?
> > >
> > > Yes. We have a lot less interdependencies to worry about during the whole
> > > operation.
> >
> > That's not an advantage. That's why it has *sucked*.
>
> Actually, the less things happen while we're creating and saving the image,
> the less sources of potential problems there are and by freezing the kernel
> threads (not all of them), we cause less things to happen at that time.
>
> To make you happy, we could stop doing that, but what actual _advantage_
> that would bring?

A couple of other advantages to freezing other processes:

1) It makes predicting how much memory is available for making and
saving snapshot a tractable problem. It therefore makes hibernation
_much_ more reliable.
2) Racing against other processes would also make hibernation slower,
increasing the chances of your battery running out before the save is
complete.
3) It makes finding potential memory leaks in the code possible. It was
ages ago now, but at one stage I could display a table saying exactly
how many pages had been allocated and freed by different sections of the
process and compare the number of free pages at the start and end of the
cycle to ensure there were no memory leaks at all.

> > Trying to freeze kernel threads has _caused_ problems. It has _added_
> > these interdependencies. It hasn't removed a single dependency at any
> > time, it has just added new problems!
>
> What problems are you talking about?
>
> > > 1) if the kernel threads are frozen, we know that they don't hold any locks
> > > that could interfere with the freezing of device drivers,
> > > 2) if they are frozen, we know, for example, that they won't call user mode
> > > helpers or do similar things,
> > > 3) if they are frozen, we know that they won't submit I/O to disks and
> > > potentially damage filesystems (suspend2 has much more problems with that
> > > than swsusp, but still. And yes, there have been bug reports related to it,
> > > so it's not just my fantasy).
> >
> > NONE of these are valid explanations at all. You're listing totally
> > theoretical problems, and ignoring all the _real_ problems that trying to
> > freeze kernel threads has _caused_.
>
> Example, please?

I agree with Rafael. Freezing processes greatly helps in ensuring we
have a consistent image. He's right, too, in asserting that it's even
more important for Suspend2. Freezing processes is essential to being
able to know that those LRU pages won't change and therefore being able
to save them separately and then reuse them for the atomic copy.

> > If you want to control user-mode helpers, you do that - you do not freeze
> > kernel threads!
> >
> > And no, kernel threads do not submit IO to disks on their own. You just
> > made that up.
>
> No, I didn't. Nigel can confirm, I think.

I have had problems with MD threads generating I/O that I couldn't
account for - after userspace had been frozen, filesystems had been
nicely synced and so on. I have to speak with reservations though,
because I haven't yet gotten to the bottom of where the I/O is coming
from... too many things, too small time slices.

> > Yes, they can be involved in that whole disk submission thing, but in a good
> > way - they can be required in order to make disk writing work!
>
> Some of them can be, some other's need not be. We don't need any fs-related
> kernel threads for saving the image, for example.

Yeah, so long as we bmap the storage we want to use beforehand (thinking
of swap files and ordinary files).

> > The problem that suspend has had is that it's done everything totally the
> > wrong way around. Do kernel threads do disk IO? Sure, if asked to do so.
>
> They can be asked before we do the snapshot and complete the operation
> afterwards, no?
>
> > For example, kernel threads can be involved in md etc, but that's a *good*
> > thing.
>
> We don't freeze these threads.
>
> > The way to shut them up is not to freeze the threads, but to freeze the *disk*.
>
> In principle, you're right. In practice, go and try it.

I have to disagree here. Freezing the disk instead of the threads is
dealing with the symptoms instead of the cause.

Regards,

Nigel

Attachment: signature.asc
Description: This is a digitally signed message part