Re: [PATCH v3] mm/gup: Allow real explicit breaking of COW

From: Jan Kara
Date: Fri Aug 21 2020 - 11:48:03 EST


On Fri 21-08-20 05:27:40, Linus Torvalds wrote:
> On Fri, Aug 21, 2020 at 3:13 AM Jan Kara <jack@xxxxxxx> wrote:
> >
> > > + if (page_mapcount(page) != 1 && page_count(page) != 1) {
> >
> > So this condition looks strange to me... Did you mean:
> >
> > if (page_mapcount(page) != 1 || page_count(page) != 1)
>
> Duh. Yes.
>
> > > - if (PageKsm(vmf->page)) {
> >
> > Also I know nothing about KSM but looking at reuse_ksm_page() I can see it
> > plays some tricks with page index & mapping even for pages with page_count
> > == 1 so you cannot just drop those bits AFAICT.
>
> Yeah, I wasn't really sure what we want to do.
>
> In an optimal world, I was thinking that we'd actually do exactly what
> we do at munmap time.
>
> Which is not to get the page lock at all. Just look at what
> zap_pte_range() does for a page when it unmaps it:
>
> page_remove_rmap(page, false);
>
> and that's it. No games.
>
> And guess what? That "page_remove_rmap()" is what wp_page_copy() already
> does.

I was more concerned about the case where you decide to map a PageKsm()
page writably (i.e. the wp_page_reuse() path). That path does not touch
page->mapping in your code AFAICS. And AFAIU the code in mm/ksm.c, you are
not supposed to map PageKsm() pages writably without changing
page->mapping (which also effectively makes PageKsm() return false). I
don't see anything in your code that would achieve that. The page_count()
check does not help here, because KSM code references a page without being
accounted in page_count() for $reasons (see the comment before
get_ksm_page()) and instead plays tricks with validating cookies in
page->mapping...
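
To make the concern concrete, here is a toy userspace model (not kernel
code). The struct, the toy_* helpers and the flag names are all made up for
illustration; the flag values only mirror my reading of the
PAGE_MAPPING_* bits in include/linux/page-flags.h, so treat them as an
assumption. It shows that a page can pass a pure count-based reuse check
while the KSM marker in its mapping word is still set:

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, values assumed from page-flags.h. */
#define TOY_MAPPING_ANON    0x1UL
#define TOY_MAPPING_MOVABLE 0x2UL
#define TOY_MAPPING_KSM     (TOY_MAPPING_ANON | TOY_MAPPING_MOVABLE)
#define TOY_MAPPING_FLAGS   (TOY_MAPPING_ANON | TOY_MAPPING_MOVABLE)

struct toy_page {
	unsigned long mapping;	/* pointer bits plus low flag bits */
	int mapcount;		/* number of page table mappings */
	int refcount;		/* references that are actually counted */
};

/* The page counts as KSM while both low bits of the mapping word are set. */
static bool toy_page_ksm(const struct toy_page *p)
{
	return (p->mapping & TOY_MAPPING_FLAGS) == TOY_MAPPING_KSM;
}

/* Count-only reuse decision: reuse when both counts are exactly 1. */
static bool toy_counts_allow_reuse(const struct toy_page *p)
{
	return p->mapcount == 1 && p->refcount == 1;
}

/*
 * True when the count-only check would reuse the page although it is
 * still marked KSM, i.e. nothing has rewritten the mapping word.
 */
static bool toy_reuse_keeps_ksm(const struct toy_page *p)
{
	return toy_counts_allow_reuse(p) && toy_page_ksm(p);
}
```

In this model an uncounted reference (such as the one KSM holds) never shows
up in refcount, so toy_reuse_keeps_ksm() returns true for a KSM page with
mapcount == 1 and refcount == 1 until something rewrites the mapping word.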

> So I really think *all* of these games we play are complete garbage
> and completely wrong.
>
> Because the zap_page_range() path is a *lot* more common than the WP
> path, and triggers for every single page when we do munmap or exit or
> whatever.
>
> So why would WP need to do anything else for correctness? Absolutely
> no reason I can see.
>
> > Also I'm not sure if dropping this is safe for THP - reuse_swap_page()
> > seems to be a misnomer and seems to do also some THP handling.
>
> Again, I think that's a bogus argument.
>
> Because this all is actually not the common path at all, and the thing
> is, the common path does none of these odd games.
>
> I really think this COW handling magic is just legacy garbage because
> people have carried it along forever and everybody is worried about
> it. The fact is, the "copy" case is always safe, because all it does
> is basically the same as zap_page_range() does, with just adding a new
> page instead.

Here too, I was more concerned that the page_mapcount() != 1 ||
page_count() != 1 check could actually be a weaker check than what
reuse_swap_page() does. So the old code could decide to copy where your new
code would decide to take the wp_page_reuse() path. And for this case I
don't see how your "but unmap path is simple" argument would apply...

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR