Re: [PATCH v2 01/17] mm/gup: Fixup p*_access_permitted()

From: Dan Williams
Date: Fri Dec 15 2017 - 11:38:10 EST


On Fri, Dec 15, 2017 at 3:38 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Fri, Dec 15, 2017 at 11:25:29AM +0100, Peter Zijlstra wrote:
>> The memory one is also clearly wrong, not having access does not a write
>> fault make. If we have pte_write() set we should not do_wp_page() just
>> because we don't have access. This falls under the "doing anything other
>> than hard failure for !access is crazy" header.
>
> So per the very same reasoning I think the below is warranted too; also
> rename that @dirty variable, because its also wrong.
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 5eb3d2524bdc..0d43b347eb0a 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3987,7 +3987,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>  		.pgoff = linear_page_index(vma, address),
>  		.gfp_mask = __get_fault_gfp_mask(vma),
>  	};
> -	unsigned int dirty = flags & FAULT_FLAG_WRITE;
> +	unsigned int write = flags & FAULT_FLAG_WRITE;
>  	struct mm_struct *mm = vma->vm_mm;
>  	pgd_t *pgd;
>  	p4d_t *p4d;
> @@ -4013,7 +4013,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>
>  			/* NUMA case for anonymous PUDs would go here */
>
> -			if (dirty && !pud_access_permitted(orig_pud, WRITE)) {
> +			if (write && !pud_write(orig_pud)) {
>  				ret = wp_huge_pud(&vmf, orig_pud);
>  				if (!(ret & VM_FAULT_FALLBACK))
>  					return ret;
> @@ -4046,7 +4046,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>  			if (pmd_protnone(orig_pmd) && vma_is_accessible(vma))
>  				return do_huge_pmd_numa_page(&vmf, orig_pmd);
>
> -			if (dirty && !pmd_access_permitted(orig_pmd, WRITE)) {
> +			if (write && !pmd_write(orig_pmd)) {
>  				ret = wp_huge_pmd(&vmf, orig_pmd);
>  				if (!(ret & VM_FAULT_FALLBACK))
>  					return ret;
>
>
> I still cannot make sense of what the intention behind these changes
> were, the Changelog that went with them is utter crap, it doesn't
> explain anything.

The motivation was that I noticed get_user_pages_fast() was doing a
full pud_access_permitted() check, but the get_user_pages() slow path
was only doing a pud_write() check. That was inconsistent, so I set
out to resolve it across all the pte types and ended up making a mess
of things. I'm fine if the answer is that we should have gone the
other way and only done write checks. However, when I was
investigating which way to go, what persuaded me to start sprinkling
p??_access_permitted() checks around was that application behavior
differed between mmap access and direct-I/O access to the same buffer.
I assumed that different access behavior between those would be a
surprising inconsistency for userspace. That said, looping infinitely
in handle_mm_fault() is an even worse surprise; apologies for that.
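
To make the difference between the two checks concrete, here is a
userspace toy model of how I understand the failure mode. Everything
named toy_* is made up for illustration and is not kernel code; in
particular, treating "access denied" as a single pkey_denied flag is a
simplifying assumption. It shows that a write-protect handler can only
repair a missing write bit, so routing a non-write denial into it never
makes progress.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a page-table entry; not the kernel's pte_t. */
struct toy_pte {
	bool write;       /* the write bit, i.e. what p??_write() tests */
	bool pkey_denied; /* assumed: access is forbidden for a non-write reason */
};

enum { TOY_HARD_FAIL = -2, TOY_LIVELOCK = -1 };

static bool toy_write(struct toy_pte pte)
{
	return pte.write;
}

/* Stricter check, loosely modeled on p??_access_permitted(pte, WRITE). */
static bool toy_access_permitted(struct toy_pte pte, bool write)
{
	if (write && !pte.write)
		return false;
	return !pte.pkey_denied;
}

/* Write-protect handling can only fix a missing write bit. */
static void toy_wp_fault(struct toy_pte *pte)
{
	pte->write = true;
}

/*
 * Retry a write access until it succeeds.
 * Returns the number of faults taken, TOY_HARD_FAIL if the fault is
 * rejected outright, or TOY_LIVELOCK if max_retries is exhausted.
 */
static int toy_fault_loop(struct toy_pte pte, bool use_access_check,
			  int max_retries)
{
	for (int faults = 0; faults < max_retries; faults++) {
		if (toy_access_permitted(pte, true))
			return faults; /* the access finally goes through */

		bool needs_wp = use_access_check
			? !toy_access_permitted(pte, true)
			: !toy_write(pte);

		if (!needs_wp)
			return TOY_HARD_FAIL; /* not a write fault: fail hard */

		toy_wp_fault(&pte); /* no effect if write was already set */
	}
	return TOY_LIVELOCK;
}
```

With a writable but access-denied entry ({ .write = true, .pkey_denied
= true }), the access-permitted variant returns TOY_LIVELOCK while the
plain write-bit variant returns TOY_HARD_FAIL. An ordinary
write-protected entry resolves after one fault under either check.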