Re: [PATCH mm-new 2/2] mm/shmem: writeout free swap if swap_writeout() reactivates

From: Baolin Wang
Date: Sat Jul 19 2025 - 00:33:05 EST




On 2025/7/19 08:51, Hugh Dickins wrote:
> On Thu, 17 Jul 2025, Baolin Wang wrote:

>> Hi Hugh,

>> On 2025/7/16 16:08, Hugh Dickins wrote:
>>> If swap_writeout() returns AOP_WRITEPAGE_ACTIVATE (for example, because
>>> zswap cannot compress and memcg disables writeback), there is no virtue
>>> in keeping that folio in swap cache and holding the swap allocation:
>>> shmem_writeout() switches it back to shmem page cache before returning.

>>> Folio lock is held, and folio->memcg_data remains set throughout, so
>>> there is no need to get into any memcg or memsw charge complications:
>>> swap_free_nr() and delete_from_swap_cache() do as much as is needed (but
>>> beware the race with shmem_free_swap() when the inode is truncated or
>>> evicted).

>>> Doing the same for an anonymous folio is harder, since it will usually
>>> have been unmapped, with references to the swap left in the page tables.
>>> Adding a function to remap the folio would be fun, but not worthwhile
>>> unless it has other uses, or an urgent bug with anon is demonstrated.

>>> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
>>> ---
>>>  mm/shmem.c | 33 ++++++++++++++++++++++++++++++++-
>>>  1 file changed, 32 insertions(+), 1 deletion(-)

>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>> index 33675361031b..5a7ce4c8bad6 100644
>>> --- a/mm/shmem.c
>>> +++ b/mm/shmem.c
>>> @@ -1655,6 +1655,7 @@ int shmem_writeout(struct folio *folio, struct swap_iocb **plug,
>>>  	if (!folio_alloc_swap(folio, __GFP_HIGH | __GFP_NOMEMALLOC |
>>>  			__GFP_NOWARN)) {
>>>  		bool first_swapped = shmem_recalc_inode(inode, 0, nr_pages);
>>> +		int error;
>>>
>>>  		/*
>>>  		 * Add inode to shmem_unuse()'s list of swapped-out inodes,
>>> @@ -1675,7 +1676,37 @@ int shmem_writeout(struct folio *folio, struct swap_iocb **plug,
>>>  		shmem_delete_from_page_cache(folio, swp_to_radix_entry(folio->swap));
>>>  		BUG_ON(folio_mapped(folio));
>>> -		return swap_writeout(folio, plug);
>>> +		error = swap_writeout(folio, plug);
>>> +		if (error != AOP_WRITEPAGE_ACTIVATE) {
>>> +			/* folio has been unlocked */
>>> +			return error;
>>> +		}
>>> +
>>> +		/*
>>> +		 * The intention here is to avoid holding on to the swap when
>>> +		 * zswap was unable to compress and unable to writeback; but
>>> +		 * it will be appropriate if other reactivate cases are added.
>>> +		 */
>>> +		error = shmem_add_to_page_cache(folio, mapping, index,
>>> +				swp_to_radix_entry(folio->swap),
>>> +				__GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN);
>>> +		/* Swap entry might be erased by racing shmem_free_swap() */
>>> +		if (!error) {
>>> +			spin_lock(&info->lock);
>>> +			info->swapped -= nr_pages;
>>> +			spin_unlock(&info->lock);

>> Using the helper 'shmem_recalc_inode(inode, 0, -nr_pages)' seems more
>> readable?

> Yes, that's better, thanks: I don't know if I'd say "more readable",
> but it is much more in the spirit of shmem_recalc_inode(), bringing
> the counts into balance sooner rather than later.

> I'll follow up with a "fix" patch to Andrew.
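For concreteness, I'd guess the fixup amounts to something like the
sketch below (hypothetical, not the actual patch sent to Andrew),
swapping the open-coded accounting for the helper:

--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ ... @@ int shmem_writeout(struct folio *folio, struct swap_iocb **plug,
 		/* Swap entry might be erased by racing shmem_free_swap() */
 		if (!error) {
-			spin_lock(&info->lock);
-			info->swapped -= nr_pages;
-			spin_unlock(&info->lock);
+			shmem_recalc_inode(inode, 0, -nr_pages);
 			swap_free_nr(folio->swap, nr_pages);
 		}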


>>> +			swap_free_nr(folio->swap, nr_pages);
>>> +		}
>>> +
>>> +		/*
>>> +		 * The delete_from_swap_cache() below could be left for
>>> +		 * shrink_folio_list()'s folio_free_swap() to dispose of;
>>> +		 * but I'm a little nervous about letting this folio out of
>>> +		 * shmem_writeout() in a hybrid half-tmpfs-half-swap state
>>> +		 * e.g. folio_mapping(folio) might give an unexpected answer.
>>> +		 */
>>> +		delete_from_swap_cache(folio);

>> IIUC, should the delete_from_swap_cache() also be moved into the
>> 'if (!error)' branch? If shmem_free_swap() has freed the swap entry,
>> it would also reclaim the swap cache, no?

> No, but it was a good point to raise, and led into more research than
> I had anticipated.

> No: because shmem_free_swap->free_swap_and_cache_nr->__try_to_reclaim_swap
> has to return after doing nothing if its folio_trylock fails: it cannot do
> the delete_from_swap_cache() part of the job, which we do here - on this
> AOP_WRITEPAGE_ACTIVATE path, we hold the folio_lock throughout.

I missed the 'folio_trylock'; yes, you are right. Thanks for the explanation.
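For readers following along, the bail-out Hugh describes lives in
mm/swapfile.c's __try_to_reclaim_swap(); roughly (a sketch, with exact
details varying by kernel version):

	/* Find the folio backing this swap entry in the swap cache */
	folio = filemap_get_folio(swap_address_space(entry),
				  swap_cache_index(entry));
	if (IS_ERR(folio))
		return 0;
	/*
	 * Only trylock: callers may be holding locks themselves, so
	 * this must not sleep.  While shmem_writeout() holds the folio
	 * lock on its AOP_WRITEPAGE_ACTIVATE path, the trylock fails
	 * and the swap cache entry is left for shmem_writeout() itself
	 * to delete.
	 */
	if (!folio_trylock(folio)) {
		folio_put(folio);
		return 0;
	}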

> But it led into more research, because I wanted to point you to the
> equivalent coding in shmem_swapin_folio(): but, to my initial alarm,
> the equivalent is not there; though it used to be.

> See 5.8 commit 14235ab36019 ("mm: shmem: remove rare optimization when
> swapin races with hole punching"). There (in the deleted lines) you can
> see the helpful comment on this case, with its delete_from_swap_cache()
> when shmem_add_to_page_cache() fails. But for memcg-charging reasons,
> 5.8 found it simpler to drop that, and just let shrink_page_list()
> clear up the debris later.

> Here in shmem_writeout(), holding folio_lock throughout, we have no
> memcg complications, and can go ahead with delete_from_swap_cache(),
> both when successfully added back to page cache, and when that fails.
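(The "unexpected answer" is because folio_mapping() checks the swap
cache flag first; roughly, from mm/util.c, trimmed to the relevant
branch:

	struct address_space *folio_mapping(struct folio *folio)
	{
		/* While PG_swapcache is set, the swap mapping wins */
		if (unlikely(folio_test_swapcache(folio)))
			return swap_address_space(folio->swap);
		/* ... otherwise folio->mapping, after flag checks ... */
		return folio->mapping;
	}

so a folio re-added to shmem's page cache, but not yet deleted from
swap cache, would still report the swap address_space rather than the
shmem mapping.)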

OK. Thanks for pointing out the change history here, and I have no further questions.

With your follow-up changes, feel free to add:

Reviewed-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>