Re: [PATCH] mm: munlock use mapcount to avoid terrible overhead

From: Andrew Morton
Date: Tue Oct 18 2011 - 20:14:59 EST

Next message: Łukasz Sowa: "[RFC] cgroup: syscalls limiting subsystem"
Previous message: Joe Perches: "Re: [PATCH 0/3] ARM 4Kstacks: introduction"
In reply to: Hugh Dickins: "[PATCH] mm: munlock use mapcount to avoid terrible overhead"
Next in thread: Hugh Dickins: "Re: [PATCH] mm: munlock use mapcount to avoid terrible overhead"

On Tue, 18 Oct 2011 17:02:56 -0700 (PDT)
Hugh Dickins <hughd@xxxxxxxxxx> wrote:

> A process spent 30 minutes exiting, just munlocking the pages of a large
> anonymous area that had been alternately mprotected into page-sized vmas:
> for every single page there's an anon_vma walk through all the other
> little vmas to find the right one.
>
> A general fix to that would be a lot more complicated (use prio_tree on
> anon_vma?), but there's one very simple thing we can do to speed up the
> common case: if a page to be munlocked is mapped only once, then it is
> our vma that it is mapped into, and there's no need whatever to walk
> through all the others.
>
> Okay, there is a very remote race in munlock_vma_pages_range(), if
> between its follow_page() and lock_page(), another process were to
> munlock the same page, then page reclaim remove it from our vma, then
> another process mlock it again. We would find it with page_mapcount
> 1, yet it's still mlocked in another process. But never mind, that's
> much less likely than the down_read_trylock() failure which munlocking
> already tolerates (in try_to_unmap_one()): in due course page reclaim
> will discover and move the page to unevictable instead.
>

And how long did the test case take with the patch applied?
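
For anyone wanting to measure this, a minimal userspace sketch of the kind of test case described above might look like the following. It is not Hugh's test program, which is not shown in this thread. The helper names map_fragmented() and time_munlock() are hypothetical, and alternating PROT_READ with PROT_READ|PROT_WRITE is just one assumed way to keep neighbouring pages from sharing a vma.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: map npages of anonymous memory, touch every page,
 * then make every other page read-only so that neighbouring pages end up
 * with different protections. Returns the mapping, or NULL on failure.
 */
static void *map_fragmented(size_t npages, size_t *len_out)
{
	long psz = sysconf(_SC_PAGESIZE);
	size_t len;
	char *p;

	assert(psz > 0);
	len = npages * (size_t)psz;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;
	memset(p, 1, len);	/* fault in real anonymous pages */
	for (size_t i = 0; i < npages; i += 2) {
		if (mprotect(p + i * (size_t)psz, (size_t)psz, PROT_READ) != 0) {
			munmap(p, len);
			return NULL;
		}
	}
	*len_out = len;
	return p;
}

/*
 * Hypothetical helper: mlock the range, then time munlock() on its own.
 * Returns elapsed seconds, or -1.0 if mlock() or munlock() fails
 * (for example because RLIMIT_MEMLOCK is too low).
 */
static double time_munlock(void *addr, size_t len)
{
	struct timespec t0, t1;

	if (mlock(addr, len) != 0)
		return -1.0;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (munlock(addr, len) != 0)
		return -1.0;
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling map_fragmented() with a large page count and passing the result to time_munlock(), once on a kernel without the patch and once with it, should give the before and after numbers. The page count needed to show the problem is not stated in the report, and both RLIMIT_MEMLOCK and vm.max_map_count will probably need raising for a large run.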

> ---
> mm/mlock.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> --- 3.1-rc10/mm/mlock.c 2011-07-21 19:17:23.000000000 -0700
> +++ linux/mm/mlock.c 2011-10-06 12:47:54.670436979 -0700
> @@ -110,7 +110,10 @@ void munlock_vma_page(struct page *page)
>  	if (TestClearPageMlocked(page)) {
>  		dec_zone_page_state(page, NR_MLOCK);
>  		if (!isolate_lru_page(page)) {
> -			int ret = try_to_munlock(page);
> +			int ret = SWAP_AGAIN;
> +
> +			if (page_mapcount(page) > 1)
> +				ret = try_to_munlock(page);
>  			/*
>  			 * did try_to_unlock() succeed or punt?
>  			 */

tsk.

--- a/mm/mlock.c~mm-munlock-use-mapcount-to-avoid-terrible-overhead-fix
+++ a/mm/mlock.c
@@ -112,6 +112,11 @@ void munlock_vma_page(struct page *page)
 		if (!isolate_lru_page(page)) {
 			int ret = SWAP_AGAIN;

+			/*
+			 * Optimization: if the page was mapped just once,
+			 * that's our mapping and we don't need to check all the
+			 * other vmas.
+			 */
 			if (page_mapcount(page) > 1)
 				ret = try_to_munlock(page);
 			/*
_
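
To make the cost argument concrete, here is a toy userspace model of the shortcut. It is only a sketch and not kernel code. struct toy_page, toy_walk_vmas() and toy_munlock_cost() are hypothetical names, and the model simply assumes that each walk visits every vma on the list.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a page: only the mapping count matters here. */
struct toy_page {
	int mapcount;
};

/*
 * Toy stand-in for the anon_vma walk: assume each call has to visit
 * every vma on the list, and report how many it visited.
 */
static size_t toy_walk_vmas(size_t nr_vmas)
{
	return nr_vmas;
}

/*
 * Count the vma visits needed to munlock all pages. With use_shortcut
 * set, pages mapped exactly once skip the walk, mirroring the
 * page_mapcount(page) > 1 test in the patch.
 */
static size_t toy_munlock_cost(const struct toy_page *pages, size_t nr_pages,
			       size_t nr_vmas, int use_shortcut)
{
	size_t visits = 0;

	for (size_t i = 0; i < nr_pages; i++) {
		assert(pages[i].mapcount >= 1);
		if (!use_shortcut || pages[i].mapcount > 1)
			visits += toy_walk_vmas(nr_vmas);
	}
	return visits;
}
```

With four single-page vmas and one of the four pages mapped twice, this model counts 16 visits without the shortcut and 4 with it. When every page is mapped once, the walk cost in the model drops from quadratic in the number of vmas to zero.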
