[RFC V2] mm: change mm_advise_free to clear page dirty

From: Wang, Yalin
Date: Sat Feb 28 2015 - 01:01:58 EST


This patch adds ClearPageDirty() to clear the dirty flag of an AnonPage.
If the dirty flag is not cleared for the anon page, the page will never
be treated as freeable. We also make sure that a shared AnonPage is not
freeable: we dirty every copied AnonPage pte at fork time, so the
AnonPage cannot become freeable unless every process that shares the
page calls the madvise_free syscall.
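
As an illustration only, the sharing rule can be modelled in plain
userspace C. This is a sketch, not kernel code: the toy_* names and the
"freeable" rule (page flag clean and every mapping pte clean) are
assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT kernel types: one flag per page, two per pte. */
struct toy_page { bool dirty; };
struct toy_pte  { struct toy_page *page; bool dirty; bool young; };

/* Models the copy_one_pte() change: the copied pte is marked dirty. */
static struct toy_pte toy_fork_copy(const struct toy_pte *src)
{
    struct toy_pte dst = *src;

    dst.dirty = true;               /* pte_mkdirty() on the copy */
    return dst;
}

/* Models madvise_free_pte_range() for one anonymous page mapping. */
static void toy_madvise_free(struct toy_pte *pte)
{
    pte->page->dirty = false;       /* ClearPageDirty() */
    pte->young = false;             /* pte_mkold() */
    pte->dirty = false;             /* pte_mkclean() */
}

/*
 * Assumed reclaim rule for this sketch: the page is freeable only if
 * its own dirty flag and every pte that maps it are clean.
 */
static bool toy_page_freeable(const struct toy_page *page,
                              struct toy_pte *const ptes[], size_t n)
{
    if (page->dirty)
        return false;
    for (size_t i = 0; i < n; i++)
        if (ptes[i]->page == page && ptes[i]->dirty)
            return false;
    return true;
}
```

In this model a page shared by a parent and a child stays non-freeable
after only one of them calls toy_madvise_free(), because the other pte
is still dirty.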

Another change is that we now also handle file-mapped pages.
For a file mapping we only clear the pte young bit. This is useful
because it lets the reclaim path move file pages onto the inactive
LRU list more aggressively.
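
The difference between the anon and file cases can be sketched in
userspace C as follows. This is a simplified model under stated
assumptions, not the kernel code: struct toy_map_pte and
toy_free_pte() are hypothetical names used only for this example.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a pte; it is not the kernel's pte_t. */
struct toy_map_pte { bool dirty; bool young; };

/*
 * Mirrors the set_pte: path in this patch: the pte is always aged,
 * but it is cleaned only when the page is anonymous.
 */
static struct toy_map_pte toy_free_pte(struct toy_map_pte pte,
                                       bool page_is_anon)
{
    pte.young = false;              /* pte_mkold() */
    if (page_is_anon)
        pte.dirty = false;          /* pte_mkclean(), anon pages only */
    return pte;
}
```

A file-mapped pte therefore keeps its dirty bit, so modified file data
is still written back, while the cleared young bit makes the page look
unreferenced.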

Signed-off-by: Yalin Wang <yalin.wang@xxxxxxxxxxxxxx>
---
mm/madvise.c | 26 +++++++++++++++-----------
mm/memory.c | 12 ++++++++++--
2 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 6d0fcb8..712756b 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -299,30 +299,38 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
page = vm_normal_page(vma, addr, ptent);
if (!page)
continue;
+ if (!PageAnon(page))
+ goto set_pte;
+ if (!trylock_page(page))
+ continue;

if (PageSwapCache(page)) {
- if (!trylock_page(page))
- continue;
-
if (!try_to_free_swap(page)) {
unlock_page(page);
continue;
}
-
- ClearPageDirty(page);
- unlock_page(page);
}

/*
+ * We clear the page dirty flag for an AnonPage whether or not the
+ * page is in the swapcache. An AnonPage that is not in the swapcache
+ * can also have the dirty flag set; this happens when an AnonPage is
+ * removed from the swapcache by try_to_free_swap().
+ */
+ ClearPageDirty(page);
+ unlock_page(page);
+ /*
* Some of architecture(ex, PPC) don't update TLB
* with set_pte_at and tlb_remove_tlb_entry so for
* the portability, remap the pte with old|clean
* after pte clearing.
*/
+set_pte:
ptent = ptep_get_and_clear_full(mm, addr, pte,
tlb->fullmm);
ptent = pte_mkold(ptent);
- ptent = pte_mkclean(ptent);
+ if (PageAnon(page))
+ ptent = pte_mkclean(ptent);
set_pte_at(mm, addr, pte, ptent);
tlb_remove_tlb_entry(tlb, pte, addr);
}
@@ -364,10 +372,6 @@ static int madvise_free_single_vma(struct vm_area_struct *vma,
if (vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP))
return -EINVAL;

- /* MADV_FREE works for only anon vma at the moment */
- if (vma->vm_file)
- return -EINVAL;
-
start = max(vma->vm_start, start_addr);
if (start >= vma->vm_end)
return -EINVAL;
diff --git a/mm/memory.c b/mm/memory.c
index 8068893..3d949b3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -874,10 +874,18 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
if (page) {
get_page(page);
page_dup_rmap(page);
- if (PageAnon(page))
+ if (PageAnon(page)) {
+ /*
+ * We dirty the copied pte for an anon page.
+ * This is useful for madvise_free_pte_range():
+ * it prevents a shared anon page from being freed by
+ * the madvise_free syscall.
+ */
+ pte = pte_mkdirty(pte);
rss[MM_ANONPAGES]++;
- else
+ } else {
rss[MM_FILEPAGES]++;
+ }
}

out_set_pte:
--
2.2.2