Re: [PATCH v3 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.

From: Lorenzo Stoakes
Date: Thu Jul 17 2025 - 10:33:34 EST


On Mon, Jul 14, 2025 at 01:18:23PM -0400, Zi Yan wrote:
> Instead of open coding the ref_count calculation, use
> folio_expected_ref_count().

You really should put something here about why it is that the open-coded
value and the value returned from folio_expected_ref_count() would be
expected to be the same. See comment below inline with code.

>
> Suggested-by: David Hildenbrand <david@xxxxxxxxxx>
> Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
> Acked-by: Balbir Singh <balbirs@xxxxxxxxxx>
> Acked-by: David Hildenbrand <david@xxxxxxxxxx>

Ah haha you're literally addressing some of my code review here from the
last patch :) I love it when that happens :P

I'd like you to improve the commit message, but that's a nit so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>

See below for some analysis of folio_expected_ref_count().

> ---
> mm/huge_memory.c | 12 +++++-------
> 1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index a7ee731f974f..31b5c4e61a57 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3735,6 +3735,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> if (folio_ref_freeze(folio, 1 + extra_pins)) {
> struct address_space *swap_cache = NULL;
> struct lruvec *lruvec;
> + int expected_refs;
>
> if (folio_order(folio) > 1 &&
> !list_empty(&folio->_deferred_list)) {
> @@ -3805,11 +3806,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> new_folio = next) {
> next = folio_next(new_folio);
>
> - folio_ref_unfreeze(
> - new_folio,
> - 1 + ((mapping || swap_cache) ?
> - folio_nr_pages(new_folio) :
> - 0));
> + expected_refs = folio_expected_ref_count(new_folio) + 1;


So digging in:

static inline int folio_expected_ref_count(const struct folio *folio)
{
const int order = folio_order(folio);
int ref_count = 0;

...

if (folio_test_anon(folio)) {
/* One reference per page from the swapcache. */
ref_count += folio_test_swapcache(folio) << order;
} else {
/* One reference per page from the pagecache. */
ref_count += !!folio->mapping << order;

^---- these are covered off by (mapping || swap_cache) ? folio_nr_pages(folio) : 0

/* One reference from PG_private. */
ref_count += folio_test_private(folio);

This one is trickier.

OK so looking through the logic, the can_split_folio() function will
already assert that the only pins you have are the swapcache/page cache
ones on the 'origin' folio (the mapcount bit used in the freeze doesn't matter
as you're dealing with split, not-yet-mapped 'sub'-folios).

So this precludes an elevated refcount from PG_private, therefore this will
naturally be 0.

}

/* One reference per page table mapping. */
return ref_count + folio_mapcount(folio);

folio_mapcount() will be zero for these split folios, until remapped.

}

You add the + 1 to account for the folio pin of course.

TL;DR - this is correct AFAICT.
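
To make the argument above concrete, here is a rough userspace model of the
two calculations. It is only a sketch: struct model_folio and every model_*
name are made up for illustration, the model covers just the part of
folio_expected_ref_count() quoted above (not whatever sits behind the "..."),
and it assumes that __folio_split() has 'mapping' set only for non-anon
folios and 'swap_cache' set only for anon folios in the swapcache.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the folio state that matters here. */
struct model_folio {
	unsigned int order;	/* folio_order() */
	bool anon;		/* folio_test_anon() */
	bool swapcache;		/* folio_test_swapcache() */
	bool has_mapping;	/* folio->mapping != NULL */
	bool has_private;	/* folio_test_private() */
	int mapcount;		/* folio_mapcount() */
};

/* Mirrors only the quoted portion of folio_expected_ref_count(). */
static int model_expected_ref_count(const struct model_folio *f)
{
	int ref_count = 0;

	if (f->anon) {
		/* One reference per page from the swapcache. */
		ref_count += (int)f->swapcache << f->order;
	} else {
		/* One reference per page from the pagecache. */
		ref_count += (int)f->has_mapping << f->order;
		/* One reference from PG_private. */
		ref_count += f->has_private;
	}

	/* One reference per page table mapping. */
	return ref_count + f->mapcount;
}

/* The open-coded value the patch removes, without the caller's "1 +". */
static int model_open_coded(const struct model_folio *f)
{
	/* Assumption: mapping is NULL for anon folios in the split path. */
	bool mapping = !f->anon && f->has_mapping;
	bool swap_cache = f->anon && f->swapcache;

	return (mapping || swap_cache) ? (1 << f->order) : 0;
}

/*
 * Count the states in which the two disagree, given no PG_private
 * reference and a mapcount of zero (the conditions argued above).
 */
static int count_mismatches(unsigned int max_order)
{
	int mismatches = 0;

	assert(max_order < 31);	/* keep the shifts within int range */

	for (unsigned int order = 0; order <= max_order; order++) {
		for (int bits = 0; bits < 8; bits++) {
			struct model_folio f = {
				.order = order,
				.anon = bits & 1,
				.swapcache = bits & 2,
				.has_mapping = bits & 4,
				.has_private = false,
				.mapcount = 0,
			};

			if (model_expected_ref_count(&f) != model_open_coded(&f))
				mismatches++;
		}
	}

	return mismatches;
}
```

In this model the two values agree for every combination of anon, swapcache
and mapping, and they diverge by exactly one as soon as has_private is set on
a non-anon folio, which is why the PG_private case is the one worth a
sentence in the commit message.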


> + folio_ref_unfreeze(new_folio, expected_refs);
>
> lru_add_split_folio(folio, new_folio, lruvec, list);
>
> @@ -3839,8 +3837,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> * Otherwise, a parallel folio_try_get() can grab origin_folio
> * and its caller can see stale page cache entries.
> */
> - folio_ref_unfreeze(folio, 1 +
> - ((mapping || swap_cache) ? folio_nr_pages(folio) : 0));
> + expected_refs = folio_expected_ref_count(folio) + 1;
> + folio_ref_unfreeze(folio, expected_refs);
>
> unlock_page_lruvec(lruvec);
>
> --
> 2.47.2
>