Re: [PATCH] mm: reuse the unshared swapcache page in do_wp_page

From: David Hildenbrand
Date: Thu Jan 13 2022 - 12:25:38 EST


On 13.01.22 18:14, Linus Torvalds wrote:
> On Thu, Jan 13, 2022 at 8:48 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>>
>> I'm wondering if we can get rid of the mapcount checks in
>> reuse_swap_page() and instead check for page_count() and swapcount only.
>
> Honestly, I think even checking page_count() is pointless.
>
> If the page has users, then that's fine. That's irrelevant for whether
> it's a swap page or not, no?
>
> So if you want to remove it from the swap cache, all that matters is that
>
> (a) you have it locked so that there can be no new users trying to mix it up
>
> (b) there are no swapcount() users of this page (which don't have a
> ref to the page itself, they only have a swap entry), so that you
> can't have somebody trying to look it up (whether some racy
> "concurrent" lookup _or_ any later one, since we're about to remove
> the swap cache association).
>
> Why would "map_count()" matter - it's now many times the *page* is
> mapped, it's irrelevant to swap cache status? And for the same reason,
> what difference does "page_count()" have?
>
> One big reason I did the COW rewrite was literally that the rules were
> pure voodoo and made no sense at all. There was no underlying logic,
> it was just a random collection of random tests that didn't have any
> logical architecture to them.
>
> Our VM is really really complicated already, so I really want our code
> to make sense.
>
> So even if I'm entirely wrong in my swap/map-count arguments above,
> I'd like whatever patches in this area to be really well commented and
> have some fundamental rules and logic to them so that people can read
> the code and go "Ok, makes sense".
>
> Please?

I might be missing something, but it's not only about whether we can remove
the page from the swap cache, it's about whether we can reuse the page
exclusively in a process with write access, avoiding a COW. And for that we
have to check if it's mapped somewhere else already (readable).

Here is a sketch (buggy, untested, uncompiled) of how I think reuse_swap_page()
could look like, removing any mapcount logic and incorporating the
refount, leaving the page_trans_huge_swapcount() changes etc. out of the picture.

Would that make any sense?


bool reuse_swap_page(struct page *page, bool mapped)
{
int swapcount = 0, total_swapcount;

VM_BUG_ON_PAGE(!PageLocked(page), page);
if (unlikely(PageKsm(page)))
return false;

if (PageSwapCache(page)) {
swapcount = page_trans_huge_swapcount(page, &total_swapcount);

if (swapcount == 1 && !mapped &&
(likely(!PageTransCompound(page)) ||
/* The remaining swap count will be freed soon */
total_swapcount == page_swapcount(page))) {
if (!PageWriteback(page)) {
page = compound_head(page);
delete_from_swap_cache(page);
SetPageDirty(page);
} else {
swp_entry_t entry;
struct swap_info_struct *p;

entry.val = page_private(page);
p = swap_info_get(entry);
if (p->flags & SWP_STABLE_WRITES) {
spin_unlock(&p->lock);
return false;
}
spin_unlock(&p->lock);
}
}
}

/*
* We expect exactly one ref from our mapped page (if already mapped)
* and one ref from the swapcache (if in the swapcache).
*/
if (!mapped)
return swapcount == 1 && page_count(page) == !!PageSwapCache(page)
return swapcount == 0 && page_count(page) == 1 + !!PageSwapCache(page)
}


--
Thanks,

David / dhildenb