Re: [PATCH 5/5] i915: fence workqueue optimization

From: Andrea Arcangeli
Date: Fri Apr 07 2017 - 09:14:12 EST


On Fri, Apr 07, 2017 at 10:58:38AM +0100, Chris Wilson wrote:
> On Fri, Apr 07, 2017 at 01:23:47AM +0200, Andrea Arcangeli wrote:
> > Insist to run llist_del_all() until the free_list is found empty, this
> > may avoid having to schedule more workqueues.
>
> The work will already be scheduled (every time we add the first element,
> the work is scheduled, and the scheduled bit is cleared before the work
> is executed). So we aren't saving the kworker from having to process
> another work, but we may make that having nothing to do. The question is
> whether we want to trap the kworker here, and presumably you will also want
> to add a cond_resched() between passes.

Yes, it is somewhat dubious in the two-event-only case, but it will
save a kworker invocation in case of more events if there is a flood
of llist_add calls. It just looked fast enough, but it's up to you: it
costs one more cmpxchg for each intel_atomic_helper_free_state. If
it's unlikely that more work is added, it's better to drop it. Agreed
about cond_resched() if we keep it.

The same issue exists in __i915_gem_free_work, but I guess it's more
likely there that by the time __i915_gem_free_objects returns the
free_list isn't empty anymore, because __i915_gem_free_objects has a
longer runtime; but then you may want to re-evaluate that loop too, as
it's slower in the case of two llist_adds in a row and only pays off
from the third onward.

while ((freed = llist_del_all(&i915->mm.free_list)))
__i915_gem_free_objects(i915, freed);