Re: [v2 PATCH 1/2 -mm] mm: mremap: dwongrade mmap_sem to read when shrinking

From: Yang Shi
Date: Thu Sep 27 2018 - 12:05:26 EST




On 9/27/18 4:50 AM, Vlastimil Babka wrote:
On 9/26/18 8:10 PM, Yang Shi wrote:
Subject: [v2 PATCH 1/2 -mm] mm: mremap: dwongrade mmap_sem to read
when shrinking

"downgrade" in the subject

Will fix in the next version.

Thanks,
Yang


Other than munmap, mremap might be used to shrink memory mapping too.
So, it may hold write mmap_sem for long time when shrinking large
mapping, as what commit ("mm: mmap: zap pages with read mmap_sem in
munmap") described.

The mremap() will not manipulate vmas anymore after __do_munmap() call for
the mapping shrink use case, so it is safe to downgrade to read mmap_sem.

So, the same optimization, which downgrades mmap_sem to read for zapping
pages, is also feasible and reasonable to this case.

The period of holding exclusive mmap_sem for shrinking large mapping
would be reduced significantly with this optimization.

MREMAP_FIXED and MREMAP_MAYMOVE are more complicated to adopt this
optimization since they need manipulate vmas after do_munmap(),
downgrading mmap_sem may create race window.

Simple mapping shrink is the low hanging fruit, and it may cover the
most cases of unmap with munmap together.

Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Kirill A. Shutemov <kirill@xxxxxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Laurent Dufour <ldufour@xxxxxxxxxxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx>
Looks fine,

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

Nit:

--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2687,8 +2687,8 @@ int split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
* work. This now handles partial unmappings.
* Jeremy Fitzhardinge <jeremy@xxxxxxxx>
*/
-static int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
- struct list_head *uf, bool downgrade)
+int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
+ struct list_head *uf, bool downgrade)
{
unsigned long end;
struct vm_area_struct *vma, *prev, *last;
diff --git a/mm/mremap.c b/mm/mremap.c
index 5c2e185..8f1ec2b 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -525,6 +525,7 @@ static int vma_expandable(struct vm_area_struct *vma, unsigned long delta)
unsigned long ret = -EINVAL;
unsigned long charged = 0;
bool locked = false;
+ bool downgrade = false;
Maybe "downgraded" is more accurate here, or even "downgraded_mmap_sem".