Re: Frontswap [PATCH 0/4] (was Transcendent Memory): overview

From: Avi Kivity
Date: Fri Apr 23 2010 - 11:01:49 EST


On 04/23/2010 05:52 PM, Avi Kivity wrote:

I see. So why not implement this as an ordinary swap device, with a higher priority than the disk device? this way we reuse an API and keep things asynchronous, instead of introducing a special purpose API.


Ok, from your original post:

An "init" prepares the pseudo-RAM to receive frontswap pages and returns
a non-negative pool id, used for all swap device numbers (aka "type").
A "put_page" will copy the page to pseudo-RAM and associate it with
the type and offset associated with the page. A "get_page" will copy the
page, if found, from pseudo-RAM into kernel memory, but will NOT remove
the page from pseudo-RAM. A "flush_page" will remove the page from
pseudo-RAM and a "flush_area" will remove ALL pages associated with the
swap type (e.g., like swapoff) and notify the pseudo-RAM device to refuse
further puts with that swap type.

Once a page is successfully put, a matching get on the page will always
succeed. So when the kernel finds itself in a situation where it needs
to swap out a page, it first attempts to use frontswap. If the put returns
non-zero, the data has been successfully saved to pseudo-RAM and
a disk write and, if the data is later read back, a disk read are avoided.
If a put returns zero, pseudo-RAM has rejected the data, and the page can
be written to swap as usual.

Note that if a page is put and the page already exists in pseudo-RAM
(a "duplicate" put), either the put succeeds and the data is overwritten,
or the put fails AND the page is flushed. This ensures stale data may
never be obtained from pseudo-RAM.

Looks like "init" == open, "put_page" == write, "get_page" == read, "flush_page|flush_area" == trim. The only difference seems to be that an overwriting put_page may fail. Doesn't seem to be much of a win, since a guest can simply avoid issuing the duplicate put_page, so the hypervisor is still committed to holding this memory for the guest.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/