Re: [PATCH]: Compress hibernation image with LZO (in-kernel)

From: Nigel Cunningham
Date: Tue Aug 03 2010 - 22:37:27 EST


Hi.

On 04/08/10 12:18, KAMEZAWA Hiroyuki wrote:
> On Wed, 04 Aug 2010 12:14:19 +1000
> Bojan Smojver <bojan@xxxxxxxxxxxxx> wrote:
>
>> On Wed, 2010-08-04 at 11:02 +0900, KAMEZAWA Hiroyuki wrote:
>>> Then, after resume, all vmalloc() area is resumed as "allocated".
>>>
>>> Wrong ?
>>
>> I actually tried remembering vmalloc() returned pointers into a global
>> variable as you suggested. On resume, they were always set to NULL,
>> which would suggest that what has gotten into the image was the state
>> before vmalloc() was called in save_image(). See:
>> http://lkml.org/lkml/2010/8/2/537.
>>
>> Anyone else wants to comment here?
>
> Hmm, ok. let's see the result.
>
> The reason I mention about the race is my patch corrupts saved image
> by changing swap_map[] status and swap-cache radix-tree during save_image().
>
> Maybe I don't understand something important.

That's a different issue.

Remember that the snapshot includes more than just the running programs. It also includes the structs recording filesystem info and the state of swap. This is why we say you can't safely hibernate, use your filesystem from another kernel or OS, then resume: using the filesystem from another kernel/OS makes the state on disk inconsistent with the state in memory that we saved in our image. (I'm assuming the filesystem is written to, or at least that the journal is replayed.)

I'm not 100% sure, but it sounds like your issue is the same thing, only with swap. If you free a swap page post-snapshot and it gets reused for (say) saving a page of the image, then you have a problem post-resume. The resumed kernel will think the swap state is still as it was at snapshot time, and might try to swap a page back in from a slot that was freed and then reused for the image, creating in-memory corruption.
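
To make the failure concrete, here is a toy model of the sequence described above. It is only a sketch, not kernel code: ToySwap, its alloc()/free() methods and simulate_race() are hypothetical names made up for illustration, and the list called swap_map merely stands in for the kernel's swap_map[] use counts.

```python
import copy


class ToySwap:
    """Toy swap device (hypothetical, for illustration only)."""

    def __init__(self, nslots):
        self.swap_map = [0] * nslots   # in-memory use count per slot
        self.disk = [None] * nslots    # what is physically stored in each slot

    def alloc(self, data):
        slot = self.swap_map.index(0)  # first free slot; ValueError if swap is full
        self.swap_map[slot] = 1
        self.disk[slot] = data
        return slot

    def free(self, slot):
        self.swap_map[slot] = 0        # the on-disk data is left behind, only the count changes


def simulate_race():
    swap = ToySwap(4)
    slot = swap.alloc("process page A")       # a process page is swapped out

    snapshot_map = copy.copy(swap.swap_map)   # snapshot: in-memory swap state goes into the image

    swap.free(slot)                           # post-snapshot free of that slot
    image_slot = swap.alloc("image page 0")   # image writer picks up the freed slot

    swap.swap_map = snapshot_map              # resume: memory state restored, disk is not
    # The resumed kernel still believes `slot` holds process page A.
    return slot, image_slot, swap.disk[slot]
```

Calling simulate_race() returns the slot the resumed kernel trusts, the slot the image writer used, and what is actually on disk there. In this model the two slots are the same, so swapping the page back in would hand the process a page of image data.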

One solution is to allocate the swap for the image before the snapshot. This is what TuxOnIce does: it freezes processes, calculates the image statistics and uses them to allocate storage and free memory as necessary, writes the first part of the image, then does the snapshot and writes the remainder. By doing things in this order, the only unknown is the amount of memory needed for drivers, and that can be handled pretty easily.
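
The effect of that ordering can be sketched in the same toy style. Again, this is an illustration under stated assumptions rather than how TuxOnIce is actually implemented: hibernate_with_preallocation() and its parameters are hypothetical, swap is modelled as two plain lists, and the two-part image write and driver memory are left out.

```python
def hibernate_with_preallocation(nslots=4, image_pages=2):
    """Toy model: reserve image storage BEFORE the snapshot (hypothetical sketch)."""
    swap_map = [0] * nslots            # in-memory use count per slot
    disk = [None] * nslots             # what is physically stored in each slot
    swap_map[0], disk[0] = 1, "process page A"   # one page already swapped out

    # 1. Reserve storage for the image while the swap state can still change.
    free_slots = [s for s in range(nslots) if swap_map[s] == 0]
    if len(free_slots) < image_pages:
        raise RuntimeError("not enough swap for the image")
    reserved = free_slots[:image_pages]
    for s in reserved:
        swap_map[s] = 1

    # 2. Snapshot: the saved swap state already shows the reserved slots as in use.
    snapshot_map = list(swap_map)

    # 3. Write the image only into the reserved slots; no post-snapshot alloc or free.
    for i, s in enumerate(reserved):
        disk[s] = f"image page {i}"

    # 4. Resume: the in-memory swap state is restored from the snapshot.
    swap_map = list(snapshot_map)
    return {
        "reserved": reserved,
        "snapshot_map": snapshot_map,
        "process_page": disk[0],
    }
```

Because the reservation happens before the snapshot, the restored swap state agrees with what is on disk: the slot holding the process page is never touched, and the image slots are recorded as used rather than being silently reused. How the reserved slots get released after resume is outside what this sketch covers.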

Regards,

Nigel