Re: [Intel-gfx] [PATCH 6/6] treewide: remove check of list iterator against head past the loop body

From: Tvrtko Ursulin
Date: Wed Mar 02 2022 - 12:15:14 EST



On 28/02/2022 11:08, Jakob Koschel wrote:
When list_for_each_entry() completes the iteration over the whole list
without breaking the loop, the iterator value will be a bogus pointer
computed based on the head element.

While it is safe to use the pointer to determine if it was computed
based on the head element, either with list_entry_is_head() or
&pos->member == head, using the iterator variable after the loop should
be avoided.

In preparation to limiting the scope of a list iterator to the list
traversal loop, use a dedicated pointer to point to the found element.

Signed-off-by: Jakob Koschel <jakobkoschel@xxxxxxxxx>

[snip until i915 parts]

drivers/gpu/drm/i915/gem/i915_gem_context.c | 14 +++---
.../gpu/drm/i915/gem/i915_gem_execbuffer.c | 15 ++++---
drivers/gpu/drm/i915/gt/intel_ring.c | 15 ++++---

[snip]

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 00327b750fbb..80c79028901a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -107,25 +107,27 @@ static void lut_close(struct i915_gem_context *ctx)
radix_tree_for_each_slot(slot, &ctx->handles_vma, &iter, 0) {
struct i915_vma *vma = rcu_dereference_raw(*slot);
struct drm_i915_gem_object *obj = vma->obj;
- struct i915_lut_handle *lut;
+ struct i915_lut_handle *lut = NULL;
+ struct i915_lut_handle *tmp;

if (!kref_get_unless_zero(&obj->base.refcount))
continue;

spin_lock(&obj->lut_lock);
- list_for_each_entry(lut, &obj->lut_list, obj_link) {
- if (lut->ctx != ctx)
+ list_for_each_entry(tmp, &obj->lut_list, obj_link) {
+ if (tmp->ctx != ctx)
continue;

- if (lut->handle != iter.index)
+ if (tmp->handle != iter.index)
continue;

- list_del(&lut->obj_link);
+ list_del(&tmp->obj_link);
+ lut = tmp;
break;
}
spin_unlock(&obj->lut_lock);

- if (&lut->obj_link != &obj->lut_list) {
+ if (lut) {
i915_lut_handle_free(lut);
radix_tree_iter_delete(&ctx->handles_vma, &iter, slot);

Looks okay although personally I would have left lut as is for a smaller diff and introduced a new local like 'found' or 'unlinked'.

i915_vma_close(vma);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 1736efa43339..fda9e3685ad2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2444,7 +2444,8 @@ static struct i915_request *eb_throttle(struct i915_execbuffer *eb, struct intel
{
struct intel_ring *ring = ce->ring;
struct intel_timeline *tl = ce->timeline;
- struct i915_request *rq;
+ struct i915_request *rq = NULL;
+ struct i915_request *tmp;

/*
* Completely unscientific finger-in-the-air estimates for suitable
@@ -2460,15 +2461,17 @@ static struct i915_request *eb_throttle(struct i915_execbuffer *eb, struct intel
* claiming our resources, but not so long that the ring completely
* drains before we can submit our next request.
*/
- list_for_each_entry(rq, &tl->requests, link) {
- if (rq->ring != ring)
+ list_for_each_entry(tmp, &tl->requests, link) {
+ if (tmp->ring != ring)
continue;

- if (__intel_ring_space(rq->postfix,
- ring->emit, ring->size) > ring->size / 2)
+ if (__intel_ring_space(tmp->postfix,
+ ring->emit, ring->size) > ring->size / 2) {
+ rq = tmp;
break;
+ }
}
- if (&rq->link == &tl->requests)
+ if (!rq)
return NULL; /* weird, we will check again later for real */

Alternatively, instead of break could simply do "return i915_request_get(rq);" and replace the end of the function after the loop with "return NULL;". A bit smaller diff, or at least less "spread out" over the function, so might be easier to backport stuff touching this area in the future. But looks correct as is.


return i915_request_get(rq);
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
index 2fdd52b62092..4881c4e0c407 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -191,24 +191,27 @@ wait_for_space(struct intel_ring *ring,
struct intel_timeline *tl,
unsigned int bytes)
{
- struct i915_request *target;
+ struct i915_request *target = NULL;
+ struct i915_request *tmp;
long timeout;

if (intel_ring_update_space(ring) >= bytes)
return 0;

GEM_BUG_ON(list_empty(&tl->requests));
- list_for_each_entry(target, &tl->requests, link) {
- if (target->ring != ring)
+ list_for_each_entry(tmp, &tl->requests, link) {
+ if (tmp->ring != ring)
continue;

/* Would completion of this request free enough space? */
- if (bytes <= __intel_ring_space(target->postfix,
- ring->emit, ring->size))
+ if (bytes <= __intel_ring_space(tmp->postfix,
+ ring->emit, ring->size)) {
+ target = tmp;
break;
+ }
}

- if (GEM_WARN_ON(&target->link == &tl->requests))
+ if (GEM_WARN_ON(!target))
return -ENOSPC;

timeout = i915_request_wait(target,

Looks okay as well. Less clear here if there is a clean solution to make the diff smaller so no suggestions. I mean do I dare mention "goto found;" from inside the loop, where the break is, instead of the variable renames.. risky.. :) (And ofc "return -ENOSPC" immediately after the loop.)

As a summary changes looks okay, up to you if you want to try to make the diffs smaller or not. It doesn't matter hugely really, all I have is a vague and uncertain "maybe it makes backporting of something, someday easier". So for i915 it is good either way.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> # i915 bits only

Regards,

Tvrtko