Re: [PATCH v2 3/4] mm: Do early cow for pinned pages during fork() for ptes

From: Jason Gunthorpe
Date: Sat Sep 26 2020 - 19:27:19 EST


On Fri, Sep 25, 2020 at 06:25:59PM -0400, Peter Xu wrote:
> -static inline void
> +/*
> + * Copy one pte. Returns 0 if succeeded, or -EAGAIN if one preallocated page
> + * is required to copy this pte.
> + */
> +static inline int
> copy_present_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
> pte_t *dst_pte, pte_t *src_pte, struct vm_area_struct *vma,
> - unsigned long addr, int *rss)
> + struct vm_area_struct *new,
> + unsigned long addr, int *rss, struct page **prealloc)
> {
> unsigned long vm_flags = vma->vm_flags;
> pte_t pte = *src_pte;
> struct page *page;
>
> + page = vm_normal_page(vma, addr, pte);
> + if (page) {
> + if (is_cow_mapping(vm_flags)) {
> + bool is_write = pte_write(pte);

Very minor, but I liked the readability of putting this chunk in its
own function, 'copy_normal_page', with the src/dst naming.

> +
> + /*
> + * We have a prealloc page, all good! Take it
> + * over and copy the page & arm it.
> + */
> + *prealloc = NULL;
> + copy_user_highpage(new_page, page, addr, vma);
> + __SetPageUptodate(new_page);
> + pte = mk_pte(new_page, new->vm_page_prot);
> + pte = pte_sw_mkyoung(pte);

Linus's version doesn't do pte_sw_mkyoung(), but it looks OK to have it.

> + pte = maybe_mkwrite(pte_mkdirty(pte), new);

maybe_mkwrite() was not in Linus's version, but is in
wp_page_copy(). It seemed like mk_pte() should set the proper write
bit already from the vm_page_prot? Perhaps this is harmless but
redundant?
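
To make the question concrete, here is a rough user-space model of
what maybe_mkwrite() does as I understand it: it sets the write bit
only when the VMA has VM_WRITE. The types, names and bit values below
(toy_pte_t, struct toy_vma, TOY_*) are made-up stand-ins for
illustration, not the kernel's definitions, and the sketch says
nothing about what mk_pte() derives from vm_page_prot.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; bit values are arbitrary, not the kernel's. */
typedef struct { unsigned long val; } toy_pte_t;
struct toy_vma { unsigned long vm_flags; };

#define TOY_VM_WRITE  0x2UL   /* models VM_WRITE in vm_flags */
#define TOY_PTE_WRITE 0x2UL   /* models the pte write bit */
#define TOY_PTE_DIRTY 0x40UL  /* models the pte dirty bit */

static inline toy_pte_t toy_pte_mkwrite(toy_pte_t pte)
{
	pte.val |= TOY_PTE_WRITE;
	return pte;
}

static inline toy_pte_t toy_pte_mkdirty(toy_pte_t pte)
{
	pte.val |= TOY_PTE_DIRTY;
	return pte;
}

static inline bool toy_pte_write(toy_pte_t pte)
{
	return (pte.val & TOY_PTE_WRITE) != 0;
}

/* Models maybe_mkwrite(): writable only if the VMA permits writes. */
static inline toy_pte_t toy_maybe_mkwrite(toy_pte_t pte,
					  const struct toy_vma *vma)
{
	if (vma->vm_flags & TOY_VM_WRITE)
		pte = toy_pte_mkwrite(pte);
	return pte;
}

/* Usage: the same dirty pte ends up writable or not depending on the VMA. */
static inline void toy_selfcheck(void)
{
	struct toy_vma rw = { .vm_flags = TOY_VM_WRITE };
	struct toy_vma ro = { .vm_flags = 0 };
	toy_pte_t pte = { .val = 0 };

	assert(toy_pte_write(toy_maybe_mkwrite(toy_pte_mkdirty(pte), &rw)));
	assert(!toy_pte_write(toy_maybe_mkwrite(toy_pte_mkdirty(pte), &ro)));
}
```

In this model the call is a no-op only if the write bit was already
set on the way in, which is the case the question above is about.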

> + page_add_new_anon_rmap(new_page, new, addr, false);
> + rss[mm_counter(new_page)]++;
> + set_pte_at(dst_mm, addr, dst_pte, pte);

Linus's patch had a lru_cache_add_inactive_or_unevictable() here, like
wp_page_copy()

Didn't think of anything profound to say; looks good, thanks!

I'll forward this for testing as well. There are some holidays next
week, so I may have been optimistic to think it would be done by Monday.

Jason