Re: [PATCH v1 4/6] dma-buf: Acquire wait-wound context on attachment

From: Christian König
Date: Wed Jul 20 2022 - 04:30:03 EST


Am 19.07.22 um 22:05 schrieb Dmitry Osipenko:
On 7/15/22 09:59, Dmitry Osipenko wrote:
On 7/15/22 09:50, Christian König wrote:
Am 15.07.22 um 02:52 schrieb Dmitry Osipenko:
Intel i915 GPU driver uses wait-wound mutex to lock multiple GEMs on the
attachment to the i915 dma-buf. In order to let all drivers utilize
shared
wait-wound context during attachment in a general way, make dma-buf
core to
acquire the ww context internally for the attachment operation and update
i915 driver to use the importer's ww context instead of the internal one.

 From now on all dma-buf exporters shall use the importer's ww context
for
the attachment operation.

Signed-off-by: Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx>
---
  drivers/dma-buf/dma-buf.c                     |  8 +++++-
  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |  2 +-
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  2 +-
  drivers/gpu/drm/i915/gem/i915_gem_object.h    |  6 ++---
  drivers/gpu/drm/i915/i915_gem_evict.c         |  2 +-
  drivers/gpu/drm/i915/i915_gem_ww.c            | 26 +++++++++++++++----
  drivers/gpu/drm/i915/i915_gem_ww.h            | 15 +++++++++--
  7 files changed, 47 insertions(+), 14 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 0ee588276534..37545ecb845a 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -807,6 +807,8 @@ static struct sg_table * __map_dma_buf(struct
dma_buf_attachment *attach,
   * Optionally this calls &dma_buf_ops.attach to allow
device-specific attach
   * functionality.
   *
+ * Exporters shall use ww_ctx acquired by this function.
+ *
   * Returns:
   *
   * A pointer to newly created &dma_buf_attachment on success, or a
negative
@@ -822,6 +824,7 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf
*dmabuf, struct device *dev,
                  void *importer_priv)
  {
      struct dma_buf_attachment *attach;
+    struct ww_acquire_ctx ww_ctx;
      int ret;
        if (WARN_ON(!dmabuf || !dev))
@@ -841,7 +844,8 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf
*dmabuf, struct device *dev,
      attach->importer_ops = importer_ops;
      attach->importer_priv = importer_priv;
  -    dma_resv_lock(dmabuf->resv, NULL);
+    ww_acquire_init(&ww_ctx, &reservation_ww_class);
+    dma_resv_lock(dmabuf->resv, &ww_ctx);
That won't work like this. The core property of a WW context is that you
need to unwind all the locks and re-quire them with the contended one
first.

When you statically lock the imported one here you can't do that any more.
You're right. I felt that something is missing here, but couldn't
notice. I'll think more about this and enable
CONFIG_DEBUG_WW_MUTEX_SLOWPATH. Thank you!

Christian, do you think we could make an excuse for the attach()
callback and make the exporter responsible for taking the resv lock? It
will be inconsistent with the rest of the callbacks, where importer
takes the lock, but it will be the simplest and least invasive solution.
It's very messy to do a cross-driver ww locking, I don't think it's the
right approach.

So to summarize the following calls will require that the caller hold the resv lock:
1. dma_buf_pin()/dma_buf_unpin()
2. dma_buf_map_attachment()/dma_buf_unmap_attachment()
3. dma_buf_vmap()/dma_buf_vunmap()
4. dma_buf_move_notify()

The following calls require that caller does not held the resv lock:
1. dma_buf_attach()/dma_buf_dynamic_attach()/dma_buf_detach()
2. dma_buf_export()/dma_buf_fd()
3. dma_buf_get()/dma_buf_put()
4. dma_buf_begin_cpu_access()/dma_buf_end_cpu_access()

If that's correct than that would work for me as well, but we should probably document this.

Or let me ask the other way around: What calls exactly do you need to change to solve your original issue? That was vmap/vunmap, wasn't it? If yes then let's concentrate on those for the moment.

Regards,
Christian.