Re: [PATCH v2 01/17] mm/gup: Fixup p*_access_permitted()

From: Peter Zijlstra
Date: Thu Dec 14 2017 - 09:38:02 EST


On Thu, Dec 14, 2017 at 01:41:17PM +0100, Peter Zijlstra wrote:
> On Thu, Dec 14, 2017 at 12:27:27PM +0100, Peter Zijlstra wrote:
> > The gup_*_range() functions which implement __get_user_pages_fast() do
> > a p*_access_permitted() test to see if the memory is at all accessible
> > (tests both _PAGE_USER|_PAGE_RW as well as architectural things like
> > pkeys).
> >
> > But the follow_*() functions which implement __get_user_pages() do not
> > have this test. Recently, commit:
> >
> > 5c9d2d5c269c ("mm: replace pte_write with pte_access_permitted in fault + gup paths")
> >
> > added it to a few specific write paths, but it failed to consistently
> > apply it (I've not audited anything outside of gup).
> >
> > Revert the change from that patch and insert the tests in the right
> > locations such that they cover all READ / WRITE accesses for all
> > pte/pmd/pud levels.
> >
> > In particular I care about the _PAGE_USER test, we should not ever,
> > allow access to pages not marked with it, but it also makes the pkey
> > accesses more consistent.
>
> This should probably go on top. These are now all superfluous and
> slightly wrong.

I also cannot explain dax_mapping_entry_mkclean(), why would we not make
clean those pages that are not pkey writable (but clearly are writable
and dirty)? That doesn't make any sense at all.

Kirill did point out that my patch(es) break FOLL_DUMP in that they would
now exclude pkey protected pages from core-dumps.

My counter argument is that it will now properly exclude !_PAGE_USER
pages.

If we change p??_access_permitted() to pass the full follow flags
instead of just the write part we could fix that.
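
A rough userspace model of that idea, not kernel code. It assumes a
hypothetical helper that receives the whole follow-flags word, so it can
always insist on the present and user bits but skip only the pkey check
for dump-style accesses. Every model_* and MODEL_* name is made up for
illustration. The bit values are merely chosen to resemble x86, and the
PKRU layout (access-disable in bit 2*k, write-disable in bit 2*k+1) is
modelled as a plain integer.

```c
#include <assert.h>
#include <stdbool.h>

/* Model PTE bits (hypothetical names; values resemble x86). */
#define MODEL_PAGE_PRESENT 0x1u
#define MODEL_PAGE_RW      0x2u
#define MODEL_PAGE_USER    0x4u

/* Model follow flags (hypothetical names and values). */
#define MODEL_FOLL_WRITE 0x01u
#define MODEL_FOLL_DUMP  0x08u

struct model_pte {
	unsigned int flags;	/* MODEL_PAGE_* bits */
	unsigned int pkey;	/* protection key, 0..15 */
};

/* PKRU model: bit 2*k is access-disable, bit 2*k+1 is write-disable. */
static bool model_pkey_allows(unsigned int pkru, unsigned int pkey, bool write)
{
	unsigned int bits;

	assert(pkey < 16);
	bits = (pkru >> (2 * pkey)) & 0x3u;

	if (bits & 0x1u)
		return false;		/* all access disabled */
	if (write && (bits & 0x2u))
		return false;		/* writes disabled */
	return true;
}

/*
 * Takes the full follow flags rather than a bare 'write' boolean.
 * The present/user/rw bits are always enforced; only the pkey test
 * is skipped when the caller is dumping.
 */
static bool model_pte_access_permitted(struct model_pte pte, unsigned int pkru,
				       unsigned int foll_flags)
{
	bool write = (foll_flags & MODEL_FOLL_WRITE) != 0;
	unsigned int need = MODEL_PAGE_PRESENT | MODEL_PAGE_USER;

	if (write)
		need |= MODEL_PAGE_RW;
	if ((pte.flags & need) != need)
		return false;

	if (foll_flags & MODEL_FOLL_DUMP)
		return true;

	return model_pkey_allows(pkru, pte.pkey, write);
}
```

In this model a pkey-protected user page is refused for a normal read
but accepted with MODEL_FOLL_DUMP set, while a page without the user bit
is refused either way.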

I'm also looking at pte_access_permitted() in handle_pte_fault(); that
looks very dodgy to me. How does that not result in endlessly CoW'ing
the same page over and over when we have a PKEY disallowing write access
on that page?
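
The worry can be put as a toy state machine. This is only a sketch of
the question, under the assumption that the write fault reaches a
handler which treats "write not permitted" as "needs CoW". Whether the
real fault path gets that far for a pkey fault is exactly what is being
asked. All model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_RW 0x2u	/* hypothetical model bit */

struct model_page_state {
	unsigned int pte_flags;		/* MODEL_PAGE_RW or 0 */
	bool pkey_write_disabled;	/* pkey forbids writes */
	unsigned int cow_copies;	/* how many times we copied */
};

/* Combined check: needs both the RW bit and pkey write permission. */
static bool model_write_permitted(const struct model_page_state *s)
{
	return (s->pte_flags & MODEL_PAGE_RW) && !s->pkey_write_disabled;
}

/*
 * Model fault handler: if the write is not permitted, copy the page and
 * map the copy writable. The pkey is untouched, so a pkey denial
 * survives the copy.
 */
static void model_handle_write_fault(struct model_page_state *s)
{
	if (!model_write_permitted(s)) {
		s->cow_copies++;
		s->pte_flags |= MODEL_PAGE_RW;
	}
}

/* Retry the write until it is permitted or we give up; returns faults taken. */
static unsigned int model_retry_write(struct model_page_state *s,
				      unsigned int max_faults)
{
	unsigned int faults = 0;

	assert(max_faults > 0);
	while (!model_write_permitted(s) && faults < max_faults) {
		model_handle_write_fault(s);
		faults++;
	}
	return faults;
}
```

With a plain read-only page the model takes one fault and one copy. With
the pkey denying writes it copies on every retry and only stops at the
artificial max_faults bound.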

Bah... /me grumpy