Re: [PATCH] mm: reuse the unshared swapcache page in do_wp_page

From: Linus Torvalds
Date: Thu Jan 20 2022 - 10:37:34 EST


On Thu, Jan 20, 2022 at 5:26 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> So your claim is that read-only, PTE mapped pages are weird? How do you
> come to that conclusion?
>
> If we adjust the THP reuse logic to split on additional references
> (page_count() == 1) -- similarly as suggested by Linus to fix the CVE --
> we're going to end up with exactly that more frequently.

If you write to a THP page that has page_count() elevated - presumably
because of a fork() - then that COW is exactly what you want to
happen.

And presumably you want it to happen page-by-page.

So I fail to see what the problem is.

The *normal* THP case is that there hasn't been a fork, and there is
no COW activity. That's the only thing worth trying to optimize for
and worry about.

If you do some kind of fork with huge-pages, and actually write to
those pages (as opposed to just execve() in the child and wait in the
parent), you only have yourself to blame. You *will* take COW faults,
and you have to do it, and at that point splitting the THP in whoever
did the COW is likely the right thing to do just to hope that you
don't have to allocate a whole new hugepage. So it will cause new
(small-page) allocations and copies.

And yes, at that point, writes to the THP page will cause COWs for
both sides as they both end up making that "split it" decision.
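
As a rough illustration of the decision described above, here is a
userspace toy model. It is not kernel code: struct toy_page,
toy_wp_fault() and the enum names are hypothetical, and refcount only
stands in for page_count().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; this is not the kernel's struct page. */
struct toy_page {
    int refcount;  /* stands in for page_count() */
    bool is_thp;   /* true if the write hit a huge page mapping */
};

enum toy_wp_action {
    TOY_REUSE,          /* sole reference: just make the mapping writable */
    TOY_COPY_SMALL,     /* shared small page: copy that one page */
    TOY_SPLIT_AND_COPY  /* shared THP: split, then copy only the small
                           page that was actually written */
};

/* Decide what a write fault on a read-only mapping should do. */
static enum toy_wp_action toy_wp_fault(const struct toy_page *page)
{
    assert(page != NULL);

    /* Nobody else holds a reference, so there is nothing to COW. */
    if (page->refcount == 1)
        return TOY_REUSE;

    /* Elevated count (presumably a fork): COW, page-by-page. */
    return page->is_thp ? TOY_SPLIT_AND_COPY : TOY_COPY_SMALL;
}
```

In this model both parent and child see an elevated count on the THP
after fork(), so whichever side writes first takes the
TOY_SPLIT_AND_COPY path, and the other side does the same on its own
write.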

Honestly, would anything else ever even make sense?

If you care about THP performance, you make sure that the COW THP case
simply never happens.
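
One possible way for userspace to arrange that, sketched under
assumptions: this is Linux-only, alloc_nofork_hugebuf() is a
hypothetical helper name, and using MADV_DONTFORK here is an
illustration rather than something proposed in this thread. The idea is
that a range the child never inherits cannot take COW faults after
fork().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map an anonymous buffer, ask for THP backing,
 * and keep the range out of any child created by fork().
 * Returns NULL on failure.
 */
static void *alloc_nofork_hugebuf(size_t len)
{
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    /* Best effort only: THP may be disabled or not built in. */
    (void)madvise(buf, len, MADV_HUGEPAGE);

    /* The child will not see this range, so it is never shared for COW. */
    if (madvise(buf, len, MADV_DONTFORK) != 0) {
        munmap(buf, len);
        return NULL;
    }
    return buf;
}
```

The buffer is released with munmap(buf, len). Note that a child which
touches the range will fault, so this only fits the "execve() in the
child" style of fork use.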

Linus