Re: [PATCH 2/5] mm,hugetlb: Document the reason to lock the folio in the faulting path

From: Oscar Salvador
Date: Fri Jun 13 2025 - 17:48:04 EST


On Fri, Jun 13, 2025 at 09:57:23PM +0200, David Hildenbrand wrote:
> What I meant is:
>
> Assume we have a pagecache page mapped into our page tables R/O (MAP_PRIVATE
> mapping).
>
> During a write fault on such a pagecache page, we end up in
> do_wp_page()->wp_page_copy() we perform the copy via __wp_page_copy_user()
> without the folio lock.

Yes, it would be similar to doing

hugetlb_fault()->hugetlb_no_page() which would map it R/O.
Then, if we write to it, we will go to hugetlb_wp().
Since it is a private mapping, we would only need to lock the folio to
see if we can re-use it (the wp_can_reuse_anon_folio() analog to
hugetlb).

> In wp_page_copy(), we retake the pt lock, to make sure that the page is
> still mapped (pte_same). If the page is no longer mapped, we retry the
> fault.
>
> In that case, we only want to make sure that the folio is still mapped after
> possibly dropping the page table lock in between.
>
> As we are holding an additional folio reference in
> do_wp_page()->wp_page_copy(), the folio cannot get freed concurrently.
>
>
> There is indeed the do_cow_fault() path where we avoid faulting in the
> pagecache page in the first place. So no page table reference, an I can
> understand why we would need the folio lock there.

But do_cow_fault() does take a reference via __do_fault()->filemap_fault().

> Regarding hugetlb_no_page(): I think we could drop the folio lock for a
> pagecache folio after inserting the folio into the page table. Just like
> do_wp_page()->wp_page_copy(), we would have to verify again under PTL if the
> folio is still mapped
>
> ... which we already do through pte_same() checks?

But that is somewhat similar what we do in the generic faulting path.

Assume you fault in a file for a private mapping and do COW.
So, do_pte_missing()->do_fault()->do_cow_fault().

do_cow_fault()->__do_fault() will a) get a reference and b) lock the folio.
And then we will proceed with copying the file to the page we will map privately.

This would be something like hugetlb_fault()->hugetlb_no_page()->hugetlb_wp().
So we have to hold the lock throughout hugetlb_wp() for file pages we are copying
to private mappings.

Now, let us assume you map the file R/O. And after a while you write-fault to it.
In the generic faulting path, that will go through:

do_pte_missing()->do_fault()->do_read_fault()
do_wp_page()->wp_page_copy()

wp_page_copy(), which indeed doesn't hold the lock (but takes a reference).

Maybe it's because it's Friday, but I'm confused as to why
do_pte_missing()->do_fault()->do_cow_fault() holds the lock while do_wp_page() doesn't
although it might the file's page we have to copy.



--
Oscar Salvador
SUSE Labs