Re: [RFC][PATCH 3/3] sched: User Mode Concurrency Groups

From: Peter Zijlstra
Date: Fri Dec 24 2021 - 06:28:41 EST


On Tue, Dec 14, 2021 at 09:44:48PM +0100, Peter Zijlstra wrote:

> The big assumption this whole thing is built on is that
> pin_user_pages() preserves user mappings insofar as accesses under
> pagefault_disable() will never generate EFAULT (unless the user does
> munmap() in which case it can keep the pieces).
>
> shrink_page_list() does page_maybe_dma_pinned() before try_to_unmap()
> and as such seems to respect this constraint.
>
> unmap_and_move() however seems willing to unmap otherwise pinned (and
> hence unmigratable) pages. This might need fixing.

AFAICT this should mostly do. I still need to check whether
get_user_pages_fast() is itself sufficient to avoid all races, or whether
we need to strengthen/augment that too.
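
For anyone following along, here is a minimal userspace sketch of the
fail-early idea in the patch below. It is a toy model, not kernel code:
struct toy_page, toy_maybe_pinned() and toy_migrate_one() are hypothetical
names, and TOY_PIN_BIAS assumes the 1024 bias that GUP_PIN_COUNTING_BIAS
uses for small pages.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumption: same value as the kernel's GUP_PIN_COUNTING_BIAS (1 << 10). */
#define TOY_PIN_BIAS 1024

/* Hypothetical stand-in for struct page; only what the sketch needs. */
struct toy_page {
	int refcount;	/* models page_ref_count() */
	int unmaps;	/* how often the user mapping was torn down */
};

/*
 * Models the small-page case of page_maybe_dma_pinned(): each pin adds
 * TOY_PIN_BIAS to the refcount, so a large refcount suggests a pin.
 * As the "maybe" implies, a page with very many ordinary references
 * would also be reported as pinned (a false positive).
 */
static bool toy_maybe_pinned(const struct toy_page *page)
{
	return page->refcount >= TOY_PIN_BIAS;
}

/*
 * Models one iteration of the migration loop: bail out with -EAGAIN
 * before unmapping anything if the page looks pinned, otherwise unmap
 * and "migrate" it.
 */
static int toy_migrate_one(struct toy_page *page)
{
	if (toy_maybe_pinned(page))
		return -EAGAIN;	/* fail early, mapping left intact */

	page->unmaps++;		/* stand-in for the unmap step */
	return 0;		/* pretend the copy and remap succeeded */
}
```

With a page whose refcount is 1 + TOY_PIN_BIAS, toy_migrate_one() returns
-EAGAIN and never touches the mapping; an unpinned page gets unmapped once
and the call returns 0.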

---
mm/migrate.c | 10 +++++++++-
mm/mprotect.c | 6 ++++++
2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index cf25b00f03c8..3850b33c64eb 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1472,7 +1472,15 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 			nr_subpages = thp_nr_pages(page);
 			cond_resched();
 
-			if (PageHuge(page))
+			/*
+			 * If the page has a pin then expected_page_refs() will
+			 * not match and the whole migration will fail later
+			 * anyway, fail early and preserve the mappings.
+			 */
+			if (page_maybe_dma_pinned(page))
+				rc = -EAGAIN;
+
+			else if (PageHuge(page))
 				rc = unmap_and_move_huge_page(get_new_page,
 						put_new_page, private, page,
 						pass > 2, mode, reason,
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9a105fce5aeb..093db725d39f 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -105,6 +105,12 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				if (page_is_file_lru(page) && PageDirty(page))
 					continue;
 
+				/*
+				 * Can't migrate pinned pages, avoid touching them.
+				 */
+				if (page_maybe_dma_pinned(page))
+					continue;
+
 				/*
 				 * Don't mess with PTEs if page is already on the node
 				 * a single-threaded process is running on.