Re: [PATCH] async: fix __lowest_in_progress()

From: Tejun Heo
Date: Thu Jan 17 2013 - 13:50:44 EST


Hello,

On Thu, Jan 17, 2013 at 10:16:50AM -0800, Linus Torvalds wrote:
> Tejun, mind explaining this one a bit more to me?
>
> If ordering matters, and we have a running queue and a pending queue,
> how could the pending queue *ever* be lower than the running one?

So, before being converted to workqueue, async spooled up its own
workers and each worker would lock and move the first pending item to
the executing list and everything was in order.

The conversion to workqueue was done by adding work_struct to each
async_entry and async just schedules the work item. The queueing and
dispatching of such work items are still in order but now each worker
thread is associated with a specific async_entry and move that
specific async_entry to the executing list. So, depending on which
worker reaches that point earlier, which is completely
non-deterministic, we may end up moving an async_entry with larger
cookie before one with smaller one.

> That implies that something was taken off the pending queue and put on
> the running queue out of order, right?
>
> And that in turn implies that there isn't much of a "lowest" ordering
> at all, so how could anybody even care about what lowest is? It seems
> to be a meaningless measure.

The execution is still lowest first as workqueue would dispatch
workers in queued order. It just is that they can reach the
synchronization point at their own differing paces.

> So with that in mind, I don't see what semantics the first part of the
> patch fixes. Can you explain more?

The problem with the code is that it's keeping a global pending list
and domain-specific running lists. I don't know why it developed like
this but even before workqueue conversion the code was weird.

* It has sorted per-domain running list, so looking at the first item
is easy.

* It has sorted global pennding list, and looking for first item in a
domain involves scanning it.

Global syncing ends up scanning all per-domain running lists and
domain syncing ends up scanning global pending list, when all we need
is each async item to be queued on two lists - global and per-domain
in-flight lists - and stay there until done.

The posted patch is minimal fix while keeping the basic operation the
same so that it doesn't disturb -stable too much. I'll prep a patch
to redo synchronization for 3.9.

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/