Re: [PATCH v2 2/2] iommu/iova: Manage the depot list size

From: Robin Murphy
Date: Thu Sep 07 2023 - 19:28:39 EST


On 2023-09-07 18:54, Srivastava, Dheeraj Kumar wrote:
Hi Robin,

On 8/21/2023 11:52 PM, Robin Murphy wrote:
Automatically scaling the depot up to suit the peak capacity of a
workload is all well and good, but it would be nice to have a way to
scale it back down again if the workload changes. To that end, add
backround reclaim that will gradually free surplus magazines if the

[ bah, I'll have to fix that typo too - thanks for the squiggle, Thunderbird ]

[...]
Looking into the trace and your patch figured out that in iova_depot_work_func workqueue function rcache->lock is taken via spin_lock. But the same lock will be taken from IRQ context also.
So, to prevent IRQ when the rcache->lock is taken we should disable IRQ.
Therefore use spin_lock_irqsave in place of normal spin_lock.

Oof, indeed it seems I totally failed to consider IRQs... this tweak looks right to me, thanks for the catch! I'll get a v3 ready for -rc1...

Cheers,
Robin.

Below changes fixes the issue

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 436f42855c29..d30e453d0fb4 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -747,11 +747,12 @@ static void iova_depot_work_func(struct work_struct *work)
 {
    struct iova_rcache *rcache = container_of(work, typeof(*rcache), work.work);
    struct iova_magazine *mag = NULL;
+    unsigned long flags;

-    spin_lock(&rcache->lock);
+    spin_lock_irqsave(&rcache->lock, flags);
    if (rcache->depot_size > num_online_cpus())
        mag = iova_depot_pop(rcache);
-    spin_unlock(&rcache->lock);
+    spin_unlock_irqrestore(&rcache->lock, flags);

    if (mag) {
        iova_magazine_free_pfns(mag, rcache->iovad);


+    if (mag) {
+        iova_magazine_free_pfns(mag, rcache->iovad);
+        iova_magazine_free(mag);
+        schedule_delayed_work(&rcache->work, IOVA_DEPOT_DELAY);
+    }
  }
  int iova_domain_init_rcaches(struct iova_domain *iovad)
@@ -752,6 +777,8 @@ int iova_domain_init_rcaches(struct iova_domain *iovad)
          rcache = &iovad->rcaches[i];
          spin_lock_init(&rcache->lock);
+        rcache->iovad = iovad;
+        INIT_DELAYED_WORK(&rcache->work, iova_depot_work_func);
          rcache->cpu_rcaches = __alloc_percpu(sizeof(*cpu_rcache),
                               cache_line_size());
          if (!rcache->cpu_rcaches) {
@@ -812,6 +839,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
              spin_lock(&rcache->lock);
              iova_depot_push(rcache, cpu_rcache->loaded);
              spin_unlock(&rcache->lock);
+            schedule_delayed_work(&rcache->work, IOVA_DEPOT_DELAY);
              cpu_rcache->loaded = new_mag;
              can_insert = true;
@@ -912,6 +940,7 @@ static void free_iova_rcaches(struct iova_domain *iovad)
              iova_magazine_free(cpu_rcache->prev);
          }
          free_percpu(rcache->cpu_rcaches);
+        cancel_delayed_work_sync(&rcache->work);
          while (rcache->depot)
              iova_magazine_free(iova_depot_pop(rcache));
      }