Re: [PATCH v14 05/10] mm: introduce memfd_secret system call to create "secret" memory areas

From: Matthew Wilcox
Date: Tue Jan 19 2021 - 15:30:27 EST


On Thu, Dec 03, 2020 at 08:29:44AM +0200, Mike Rapoport wrote:
> +static vm_fault_t secretmem_fault(struct vm_fault *vmf)
> +{
> + struct address_space *mapping = vmf->vma->vm_file->f_mapping;
> + struct inode *inode = file_inode(vmf->vma->vm_file);
> + pgoff_t offset = vmf->pgoff;
> + vm_fault_t ret = 0;
> + unsigned long addr;
> + struct page *page;
> + int err;
> +
> + if (((loff_t)vmf->pgoff << PAGE_SHIFT) >= i_size_read(inode))
> + return vmf_error(-EINVAL);
> +
> + page = find_get_page(mapping, offset);
> + if (!page) {
> +
> + page = secretmem_alloc_page(vmf->gfp_mask);
> + if (!page)
> + return vmf_error(-ENOMEM);

Just use VM_FAULT_OOM directly.
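
For readers unfamiliar with the helper, here is a minimal userspace sketch of why the detour through vmf_error() buys nothing in this spot. It assumes vmf_error() maps -ENOMEM to VM_FAULT_OOM and every other errno to VM_FAULT_SIGBUS. The model_ names and the numeric values are stand-ins made up for illustration, not the kernel definitions.

```c
#include <errno.h>

/* Stand-in values for illustration only; not the kernel's definitions. */
#define MODEL_VM_FAULT_OOM    0x1u
#define MODEL_VM_FAULT_SIGBUS 0x2u

/*
 * Hypothetical model of vmf_error(): -ENOMEM becomes the OOM fault code,
 * any other errno becomes SIGBUS.
 */
static unsigned int model_vmf_error(int err)
{
	if (err == -ENOMEM)
		return MODEL_VM_FAULT_OOM;
	return MODEL_VM_FAULT_SIGBUS;
}
```

Under that assumption, passing a constant -ENOMEM always yields the OOM code, so returning it directly says the same thing with less indirection.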

> + err = add_to_page_cache(page, mapping, offset, vmf->gfp_mask);
> + if (unlikely(err))
> + goto err_put_page;

What if the error is EEXIST because somebody else raced with you to add
a new page to the page cache?
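
One possible way to handle that case, sketched as a userspace model rather than a kernel patch: treat -EEXIST as "lost the race", drop the freshly allocated page, and retry the lookup. Everything prefixed model_ is hypothetical. The one-slot cache stands in for the mapping, and the race_winner parameter exists only to inject the competing insert deterministically.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct page and the page cache. */
struct model_page  { int refcount; };
struct model_cache { struct model_page *slot; };

/* Models add_to_page_cache(): fails with -EEXIST if the slot is taken. */
static int model_cache_add(struct model_cache *c, struct model_page *p)
{
	if (c->slot)
		return -EEXIST;
	p->refcount++;		/* the cache holds its own reference */
	c->slot = p;
	return 0;
}

/* Models find_get_page(): returns the cached page with an extra reference. */
static struct model_page *model_cache_find_get(struct model_cache *c)
{
	if (c->slot)
		c->slot->refcount++;
	return c->slot;
}

/*
 * Fault path that retries on -EEXIST instead of failing the fault.
 * race_winner, if non-NULL, is inserted once by "another thread"
 * between our lookup and our insert.
 */
static struct model_page *model_fault(struct model_cache *c,
				      struct model_page *fresh,
				      struct model_page *race_winner,
				      int *err)
{
	struct model_page *page;
	int ret;

retry:
	page = model_cache_find_get(c);
	if (page) {
		*err = 0;
		return page;
	}

	if (race_winner) {		/* simulated race window */
		model_cache_add(c, race_winner);
		race_winner = NULL;
	}

	fresh->refcount = 1;		/* models a newly allocated page */
	ret = model_cache_add(c, fresh);
	if (ret == -EEXIST) {
		fresh->refcount--;	/* models put_page() on our unused page */
		goto retry;
	}

	*err = ret;
	return ret ? NULL : fresh;
}
```

With a competing insert injected, model_fault() ends up returning the other thread's page and releasing its own, rather than reporting an error to the faulting task.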

> + err = set_direct_map_invalid_noflush(page, 1);
> + if (err)
> + goto err_del_page_cache;

Does this work correctly if somebody else has a reference to the page
in the meantime?

> + addr = (unsigned long)page_address(page);
> + flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +
> + __SetPageUptodate(page);

Once you've added it to the cache, somebody else can come along and try
to lock it. They will set PageWaiters. Now you call __SetPageUptodate,
which is a non-atomic update of page->flags, and wipe out their
PageWaiters bit. So you won't wake them up when you unlock.

You can call __SetPageUptodate before adding it to the page cache,
but once it's visible to another thread, you can't do that.
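
The lost update can be illustrated with a small userspace model, assuming the double-underscore variant is a plain load/modify/store on the flags word. The bit positions and model_ names below are made up for the sketch, and the "other thread" is interleaved by hand so the outcome is deterministic.

```c
#include <stdint.h>

/* Illustrative bit positions; not the kernel's PG_* values. */
#define MODEL_PG_UPTODATE (1u << 0)
#define MODEL_PG_WAITERS  (1u << 1)

/*
 * Non-atomic set performed after the page is visible: the racer's bit
 * lands between our load and our store, so the store of the stale
 * snapshot erases it.
 */
static uint32_t model_set_after_publish(uint32_t flags, uint32_t racer_bit)
{
	uint32_t snapshot = flags;		/* our load */

	flags |= racer_bit;			/* other thread sets its bit */
	flags = snapshot | MODEL_PG_UPTODATE;	/* our store, from stale data */
	return flags;
}

/*
 * Non-atomic set performed while the page is still private: nobody
 * else can touch the flags yet, so nothing can be lost.
 */
static uint32_t model_set_before_publish(uint32_t flags, uint32_t racer_bit)
{
	flags |= MODEL_PG_UPTODATE;	/* page not yet in the cache */
	flags |= racer_bit;		/* page now visible; waiter arrives */
	return flags;
}
```

In the first ordering the waiters bit is gone by the time the function returns; in the second both bits survive, which is the ordering suggested above.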

> + ret = VM_FAULT_LOCKED;
> + }
> +
> + vmf->page = page;

You're supposed to return the page locked, so use find_lock_page() instead
of find_get_page().

> + return ret;
> +
> +err_del_page_cache:
> + delete_from_page_cache(page);
> +err_put_page:
> + put_page(page);
> + return vmf_error(err);
> +}