Re: [PATCH RFC 34/39] mm/rmap: introduce folio_try_dup_anon_rmap_[pte|ptes|pmd]()

From: David Hildenbrand
Date: Tue Dec 05 2023 - 09:12:25 EST


On 05.12.23 15:02, Ryan Roberts wrote:
On 05/12/2023 13:50, David Hildenbrand wrote:
On 05.12.23 14:40, Ryan Roberts wrote:
On 05/12/2023 13:18, David Hildenbrand wrote:
On 05.12.23 14:17, David Hildenbrand wrote:
On 05.12.23 14:12, Ryan Roberts wrote:
On 04/12/2023 14:21, David Hildenbrand wrote:
The last user of page_needs_cow_for_dma() and __page_dup_rmap() are gone,
remove them.

Add folio_try_dup_anon_rmap_ptes() right away, we want to perform rmap
baching during fork() soon.

Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
---
    include/linux/mm.h   |   6 --
    include/linux/rmap.h | 145 +++++++++++++++++++++++++++++--------------
    2 files changed, 100 insertions(+), 51 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 24c1c7c5a99c0..f7565b35ae931 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1964,12 +1964,6 @@ static inline bool folio_needs_cow_for_dma(struct
vm_area_struct *vma,
        return folio_maybe_dma_pinned(folio);
    }
    -static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma,
-                      struct page *page)
-{
-    return folio_needs_cow_for_dma(vma, page_folio(page));
-}
-
    /**
     * is_zero_page - Query if a page is a zero page
     * @page: The page to query
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 21d72cc602adc..84439f7720c62 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -354,68 +354,123 @@ static inline void folio_dup_file_rmap_pmd(struct
folio *folio,
    #endif
    }
    -static inline void __page_dup_rmap(struct page *page, bool compound)
+static inline int __folio_try_dup_anon_rmap(struct folio *folio,

__always_inline?

Yes.

Ah, no, I did this for a reason. This function lives in a header, so it will
always be inlined.


Really? It will certainly be duplicated across every compilation unit, but
that's separate from being inlined - if the optimizer is off, won't it just end
up as an out-of-line function in every compilation unit?

Good point, I didn't really consider that here, and thinking about it it makes
perfect sense.

I think the compiler might even ignore "always_inline". I read that especially
with recursion the compiler might ignore that. But people can then complain to
the compiler writers about performance issues here, we told the compiler what we
think is best.


To be honest, my comment assumed that you had a good reason for using
__always_inline, and in that case then you should be consistent. But if you
don't have a good reason, you should probably just use inline and let the
compiler do what it thinks best?

I think __always_inline is the right thing to do here, we really want the compiler to generate specialized code. I was just somehow ignoring the scenario you described :)

__always_inline it is.

--
Cheers,

David / dhildenb