On Tue, 17 Jun 2025 12:22:10 -0300, jgg@xxxxxxxx wrote:
Weird, but I would not expect this as a general rule, not sure we
should rely on it.
I would say exported function should not get automatically
inlined. That throws all the kprobes into chaos :\
BTW, why can't the other patches in this series just use
unpin_user_page_range_dirty_lock? The way this stuff is supposed to
work is to combine adjacent physical addresses and then invoke
unpin_user_page_range_dirty_lock() on the start page of the physical
range. This is why we have the gup_folio_range_next() which does the
segmentation in an efficient way.
Combining adjacent physical is basically free math.
Segmenting to folios in the vfio side doesn't make a lot of sense,
IMHO.
drivers/vfio/vfio_iommu_type1.c | 35 +++++++++++++++++++++++++++++----
1 file changed, 31 insertions(+), 4 deletions(-)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index e952bf8bdfab..159ba80082a8 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -806,11 +806,38 @@ static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
bool do_accounting)
{
long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
- long i;
- for (i = 0; i < npage; i++)
- if (put_pfn(pfn++, dma->prot))
- unlocked++;
+ while (npage) {
+ long nr_pages = 1;
+
+ if (!is_invalid_reserved_pfn(pfn)) {
+ struct page *page = pfn_to_page(pfn);
+ struct folio *folio = page_folio(page);
+ long folio_pages_num = folio_nr_pages(folio);
+
+ /*
+ * For a folio, it represents a physically
+ * contiguous set of bytes, and all of its pages
+ * share the same invalid/reserved state.
+ *
+ * Here, our PFNs are contiguous. Therefore, if we
+ * detect that the current PFN belongs to a large
+ * folio, we can batch the operations for the next
+ * nr_pages PFNs.
+ */
+ if (folio_pages_num > 1)
+ nr_pages = min_t(long, npage,
+ folio_pages_num -
+ folio_page_idx(folio, page));
+
+ unpin_user_folio_dirty_locked(folio, nr_pages,
+ dma->prot & IOMMU_WRITE);
Are you suggesting that we should directly call
unpin_user_page_range_dirty_lock() here (patch 3/3) instead?
BTW, it appears that implementing unpin_user_folio_dirty_locked()
as an inline function may not be viable for vfio, given that
gup_put_folio() is not exported.