Re: [PATCH] mm: reuse the unshared swapcache page in do_wp_page

From: David Hildenbrand
Date: Thu Jan 20 2022 - 15:09:54 EST


On 20.01.22 21:07, Matthew Wilcox wrote:
> On Thu, Jan 20, 2022 at 08:55:12PM +0100, David Hildenbrand wrote:
>>>>> David, does any of it regards the lru_cache_add() reference issue that I
>>>>> mentioned? [1]
>
>> +++ b/mm/memory.c
>> @@ -3291,19 +3291,28 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
>> if (PageAnon(vmf->page)) {
>> struct page *page = vmf->page;
>>
>> - /* PageKsm() doesn't necessarily raise the page refcount */
>> - if (PageKsm(page) || page_count(page) != 1)
>> + /*
>> + * PageKsm() doesn't necessarily raise the page refcount.
>> + *
>> + * These checks are racy as long as we haven't locked the page;
>> + * they are a pure optimization to avoid trying to lock the page
>> + * and trying to free the swap cache when there is little hope
>> + * it will actually result in a refcount of 1.
>> + */
>> + if (PageKsm(page) || page_count(page) > 1 + PageSwapCache(page))
>> goto copy;
>> if (!trylock_page(page))
>> goto copy;
>> - if (PageKsm(page) || page_mapcount(page) != 1 || page_count(page) != 1) {
>> + if (PageSwapCache(page))
>> + try_to_free_swap(page);
>> + if (PageKsm(page) || page_count(page) != 1) {
>> unlock_page(page);
>> goto copy;
>> }
>> /*
>> - * Ok, we've got the only map reference, and the only
>> - * page count reference, and the page is locked,
>> - * it's dark out, and we're wearing sunglasses. Hit it.
>> + * Ok, we've got the only page reference from our mapping
>> + * and the page is locked, it's dark out, and we're wearing
>> + * sunglasses. Hit it.
>> */
>> unlock_page(page);
>> wp_page_reuse(vmf);
>>
>>
>> I added some vmstats that monitor various paths. After one run of
>> ./forceswap 2 1000000 1
>> I'm left with a rough delta (including some noise) of
>> anon_wp_copy_count 1799
>> anon_wp_copy_count_early 1
>> anon_wp_copy_lock 983396
>> anon_wp_reuse 0
>>
>> The relevant part of your reproducer is
>>
>> for (i = 0; i < nops; i++) {
>> if (madvise((void *)p, PAGE_SIZE * npages, MADV_PAGEOUT)) {
>> perror("madvise");
>> exit(-1);
>> }
>>
>> for (j = 0; j < npages; j++) {
>> c = p[j * PAGE_SIZE];
>> c++;
>> time -= rdtscp();
>> p[j * PAGE_SIZE] = c;
>> time += rdtscp();
>> }
>> }
>>
>> For this specific reproducer at least, the page lock seems to be the thingy that prohibits
>> reuse if I interpret the numbers correctly. We pass the initial page_count() check.
>>
>> Haven't looked into the details, and I would be curious how that performs with actual
>> workloads, if we can reproduce similar behavior.
>
> I don't see how that patch addresses the lru issue. Wouldn't we need
> something like ...
>
> if (!PageLRU(page))
> lru_add_drain_all();
>

See my other reply "No, unfortunately not in that part of my work.".

Would the lru handling help here where we force swapout of a single
page, reuse code passes the "page_count(page) > 1 + PageSwapCache(page)"
check but fails to lock the page?

--
Thanks,

David / dhildenb