Re: [PATCH v3 02/12] iommu: Add iommu_split_block interface

From: Lu Baolu
Date: Wed Apr 14 2021 - 03:24:49 EST


On 4/13/21 4:54 PM, Keqian Zhu wrote:
Block(largepage) mapping is not a proper granule for dirty log tracking.
Take an extreme example, if DMA writes one byte, under 1G mapping, the
dirty amount reported is 1G, but under 4K mapping, the dirty amount is
just 4K.

This adds a new interface named iommu_split_block in IOMMU base layer.
A specific IOMMU driver can invoke it during start dirty log. If so, the
driver also need to realize the split_block iommu ops.

We flush all iotlbs after the whole procedure is completed to ease the
pressure of IOMMU, as we will hanle a huge range of mapping in general.

Signed-off-by: Keqian Zhu <zhukeqian1@xxxxxxxxxx>
Signed-off-by: Kunkun Jiang <jiangkunkun@xxxxxxxxxx>
---
drivers/iommu/iommu.c | 41 +++++++++++++++++++++++++++++++++++++++++
include/linux/iommu.h | 11 +++++++++++
2 files changed, 52 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 667b2d6d2fc0..bb413a927870 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2721,6 +2721,47 @@ int iommu_domain_set_attr(struct iommu_domain *domain,
}
EXPORT_SYMBOL_GPL(iommu_domain_set_attr);
+int iommu_split_block(struct iommu_domain *domain, unsigned long iova,
+ size_t size)
+{
+ const struct iommu_ops *ops = domain->ops;
+ unsigned int min_pagesz;
+ size_t pgsize;
+ bool flush = false;
+ int ret = 0;
+
+ if (unlikely(!ops || !ops->split_block))
+ return -ENODEV;
+
+ min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
+ if (!IS_ALIGNED(iova | size, min_pagesz)) {
+ pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n",
+ iova, size, min_pagesz);
+ return -EINVAL;
+ }
+
+ while (size) {
+ flush = true;
+
+ pgsize = iommu_pgsize(domain, iova, size);
+
+ ret = ops->split_block(domain, iova, pgsize);
+ if (ret)
+ break;
+
+ pr_debug("split handled: iova 0x%lx size 0x%zx\n", iova, pgsize);
+
+ iova += pgsize;
+ size -= pgsize;
+ }
+
+ if (flush)
+ iommu_flush_iotlb_all(domain);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_split_block);

Do you really have any consumers of this interface other than the dirty
bit tracking? If not, I don't suggest to make this as a generic IOMMU
interface.

There is an implicit requirement for such interfaces. The
iommu_map/unmap(iova, size) shouldn't be called at the same time.
Currently there's no such sanity check in the iommu core. A poorly
written driver could mess up the kernel by misusing this interface.

This also applies to iommu_merge_page().

Best regards,
baolu