Re: [PATCH] khugepaged: Optimize __collapse_huge_page_copy_succeeded() for large folios by PTE batching

From: David Hildenbrand
Date: Wed Jun 18 2025 - 13:23:47 EST


On 18.06.25 19:10, Lorenzo Stoakes wrote:
On Wed, Jun 18, 2025 at 06:14:22PM +0200, David Hildenbrand wrote:
On 18.06.25 12:26, Dev Jain wrote:
+
/*
* ptl mostly unnecessary, but preempt has to
* be disabled to update the per-cpu stats
* inside folio_remove_rmap_pte().
*/
spin_lock(ptl);

Existing code: The PTL locking should just be moved outside of the loop.

Do we really want to hold the PTL for the duration of the loop? Are we sure
it's safe to do so? Are there any locks taken in other functions that might
sleep that'd mean holding a spinlock would be a problem?

It's a very weird thing to not hold the PTL while walking page tables, and then only grabbing it for clearing entries just to make selected functions happy ...

I mostly spotted the release_pte_folio(), which I think should be fine with a spinlock held. I missed the free_folio_and_swap_cache(), not sure if that is problematic.

Interestingly, release_pte_folio() does a

a) node_stat_mod_folio
b) folio_unlock
c) folio_putback_lru

... and folio_putback_lru() is documented to "lru_lock must not be held, interrupts must be enabled". Hmmmm. I suspect that doc is wrong.

So yeah, maybe better keep that weird looking locking like it is :)

--
Cheers,

David / dhildenb