Re: [PATCH V2 1/2] mm: hotplug: implement non-movable version ofget_user_pages() called get_user_pages_non_movable()

From: Tang Chen
Date: Tue May 14 2013 - 22:06:44 EST


Hi Benjamin, Mel,

Please see below.

On 05/14/2013 09:58 PM, Benjamin LaHaise wrote:
On Tue, May 14, 2013 at 09:24:58AM +0800, Tang Chen wrote:
Hi Mel, Benjamin, Jeff,

On 05/13/2013 11:01 PM, Benjamin LaHaise wrote:
On Mon, May 13, 2013 at 10:54:03AM -0400, Jeff Moyer wrote:
How do you propose to move the ring pages?

It's the same problem as doing a TLB shootdown: flush the old pages from
userspace's mapping, copy any existing data to the new pages, then
repopulate the page tables. It will likely require the addition of
address_space_operations for the mapping, but that's not too hard to do.


I think we add migrate_unpin() callback to decrease page->count if
necessary,
and migrate the page to a new page, and add migrate_pin() callback to pin
the new page again.

You can't just decrease the page count for this to work. The pages are
pinned because aio_complete() can occur at any time and needs to have a
place to write the completion events. When changing pages, aio has to
take the appropriate lock when changing one page for another.

In aio_complete(),

aio_complete() {
......
spin_lock_irqsave(&ctx->completion_lock, flags);
//write the completion event.
spin_unlock_irqrestore(&ctx->completion_lock, flags);
......
}

So for this problem, I think we can hold ctx->completion_lock in the aio
callbacks to prevent aio subsystem accessing pages who are being migrated.


The migrate procedure will work just as before. We use callbacks to
decrease
the page->count before migration starts, and increase it when the migration
is done.

And migrate_pin() and migrate_unpin() callbacks will be added to
struct address_space_operations.

I think the existing migratepage operation in address_space_operations can
be used. Does it get called when hot unplug occurs? That is: is testing
with the migrate_pages syscall similar enough to the memory removal case?


But as I said, for anonymous pages such as aio ring buffer, they don't have
address_space_operations. So where should we put the callbacks' pointers ?

Add something like address_space_operations to struct anon_vma ?

Thanks. :)






--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/