Re: [PATCH] uswsusp: automatically free the in-memory image once s2disk has finished with it

From: Alan Jenkins
Date: Thu Dec 03 2009 - 08:03:20 EST


Pavel Machek wrote:
On Wed 2009-12-02 22:25:16, Mel Gorman wrote:
On Wed, Dec 02, 2009 at 11:15:24PM +0100, Pavel Machek wrote:
On Wed 2009-12-02 22:07:18, Mel Gorman wrote:
On Wed, Dec 02, 2009 at 10:11:07PM +0100, Pavel Machek wrote:
On Wed 2009-12-02 14:28:12, Alan Jenkins wrote:
The original in-kernel suspend (swsusp) frees the in-memory hibernation
image before powering off the machine. s2disk doesn't, so there is
_much_ less free memory when it tries to power off.

This is a gratuitous difference. The userspace suspend interface
/dev/snapshot only allows the hibernation image to be read once.
Once the s2disk program has read the last page, we can free the entire
image.

This avoids a hang after writing the hibernation image which was
triggered by commit 5f8dcc21211a3d4e3a7a5ca366b469fb88117f61
"page-allocator: split per-cpu list into one-list-per-migrate-type":

Yes, you work around page-allocator hang. But is it right thing to do?

What's wrong with it? The hang is likely because the allocator has no
memory to work with. The patch in question makes small changes to the
amount of available memory but it shouldn't matter on uni-core. Some
structures are slightly larger but it's extremely borderline. I'm at a
loss to explain actually why it makes a difference unless things were
extremely borderline to begin with.

We reserve 4MB, for such purposes, and we already wrote image to disk
with such constraints, so memory should not be _too_ tight.

Can you try increasing PAGES_FOR_IO to 8MB or something like that?

What's wrong with just freeing the memory that is no longer required?

Nothing. But 4MB was enough to power down before, it is not enough
now, and I'd like to understand why.
Pavel
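
To put numbers on the reserve being discussed: as far as I recall, PAGES_FOR_IO is defined in kernel/power/power.h as a fixed byte count shifted right by PAGE_SHIFT. The helper below is only a user-space sketch of that arithmetic; reserve_pages() is a made-up name, and the 4 KiB page size (shift of 12) is an assumption about the machine.

```c
#include <assert.h>

/*
 * Hypothetical helper, not kernel code: number of pages covered by a
 * reserve of `mib` MiB when each page is 2^page_shift bytes.
 */
static unsigned long reserve_pages(unsigned long mib, unsigned int page_shift)
{
    assert(page_shift <= 20);   /* pages larger than 1 MiB are out of scope */
    return (mib * 1024UL * 1024UL) >> page_shift;
}
```

With 4 KiB pages, reserve_pages(4, 12) gives 1024 pages for the current 4MB reserve, and reserve_pages(8, 12) gives 2048 pages for the suggested 8MB.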

Here's a new datum:

Applying this patch has left a less frequent hang. So far it has happened twice. (Once playing last night, and once today testing hibernation with KMS enabled).

This hang happens at a different point. It happens _before_ writing out the hibernation image. That is, I don't see the textual progress bar, and if I force a power-cycle then it doesn't resume (and complains about uncleanly unmounted filesystems).

Here is the backtrace:

[top of screen]
s2disk D c1c05580 0 5988 5809 0x00000000
...
Call Trace:
...
? wait_for_common
? default_wake_function
? kthread_create
? worker_thread
? create_workqueue_thread
? worker_thread
? __create_workqueue_thread
? stop_machine_create
? disable_nonboot_cpus
? hibernation_snapshot
? snapshot_ioctl
...
? sys_ioctl


It looks like hibernation_snapshot() calls disable_nonboot_cpus() _before_ we allocate the hibernation image. (I.e. before swsusp_arch_suspend(), which calls swsusp_save()).
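
The ordering can be sketched as a toy user-space model. This is not kernel code: the stubs only borrow the names of the functions mentioned above, record() and step_index() are invented for the illustration, and every intermediate step in the real call chain is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: each stub only records its own name, in call order. */
#define MAX_STEPS 8
static const char *steps[MAX_STEPS];
static size_t nsteps;

static void record(const char *name)
{
    assert(nsteps < MAX_STEPS);
    steps[nsteps++] = name;
}

/* Stand-ins named after the functions in the backtrace and the text. */
static void disable_nonboot_cpus(void) { record("disable_nonboot_cpus"); }
static void swsusp_save(void)          { record("swsusp_save"); }
static void swsusp_arch_suspend(void)  { swsusp_save(); }

static void hibernation_snapshot(void)
{
    disable_nonboot_cpus();   /* the backtrace shows s2disk waiting in here */
    swsusp_arch_suspend();    /* the image is only allocated from here on */
}

/* Returns the position at which `name` was recorded, or -1 if never. */
static int step_index(const char *name)
{
    for (size_t i = 0; i < nsteps; i++)
        if (strcmp(steps[i], name) == 0)
            return (int)i;
    return -1;
}
```

After one call to hibernation_snapshot(), step_index("disable_nonboot_cpus") is lower than step_index("swsusp_save"). In this model, a task blocked in the first step has not yet reached the image allocation.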

So I think Pavel's right, we still need to work out what's happening here.

Regards
Alan