Re: [PATCH 2/3] workqueue: not allow recursion run_workqueue

From: Oleg Nesterov
Date: Thu Feb 05 2009 - 12:05:45 EST


On 02/05, Lai Jiangshan wrote:
>
> DEADLOCK EXAMPLE to explain my opinion above:
>
> (work_func0() and work_func1() are work callbacks, and they
> call flush_workqueue())
>
> CPU#0                            CPU#1
> run_workqueue()                  run_workqueue()
> work_func0()                     work_func1()
> flush_workqueue()                flush_workqueue()
> flush_cpu_workqueue(0)           .
> flush_cpu_workqueue(cpu#1)       flush_cpu_workqueue(cpu#0)
> waiting work_func1() in cpu#1    waiting work_func0 in cpu#0
>
> DEADLOCK!

I am not sure. Note that when work_func0() ends up calling
run_workqueue(), it will clear cwq->current_work, so the other
flush_cpu_workqueue() on CPU#1 will not wait for work_func0, no?
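
To make the cwq->current_work point concrete, here is a minimal
user-space sketch. It is not the kernel code: struct cwq_model,
struct work_model and every model_* function are hypothetical
stand-ins, there is no locking and no second thread, and it assumes
the nested run actually finds at least one queued item to execute.
It only shows that once a nested run has finished an item,
current_work is NULL even though the outer callback is still running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cwq_model;
typedef void (*work_fn)(struct cwq_model *cwq);

/* Hypothetical stand-in for a work item: just a callback. */
struct work_model { work_fn func; };

#define MAX_PENDING 4

/* Hypothetical stand-in for the per-CPU workqueue state. */
struct cwq_model {
    struct work_model *worklist[MAX_PENDING];
    int head, tail;                  /* pending items are [head, tail) */
    struct work_model *current_work; /* item being executed, or NULL */
};

static void model_queue(struct cwq_model *cwq, struct work_model *w)
{
    assert(cwq->tail < MAX_PENDING);
    cwq->worklist[cwq->tail++] = w;
}

/* Run all pending items; current_work is reset after each one. */
static void model_run_workqueue(struct cwq_model *cwq)
{
    while (cwq->head < cwq->tail) {
        struct work_model *w = cwq->worklist[cwq->head++];

        cwq->current_work = w;
        w->func(cwq);
        cwq->current_work = NULL;
    }
}

/* Would a flusher on another CPU decide it has something to wait for? */
static bool model_remote_flush_would_wait(const struct cwq_model *cwq)
{
    return cwq->head < cwq->tail || cwq->current_work != NULL;
}

static bool remote_would_wait_during_w0;

static void noop_func(struct cwq_model *cwq) { (void)cwq; }

/* Plays the role of work_func0(): flushes its own queue from the callback. */
static void self_flushing_func(struct cwq_model *cwq)
{
    model_run_workqueue(cwq); /* the "cwq->thread == current" path */
    /* Still inside the callback: what would a remote flusher see now? */
    remote_would_wait_during_w0 = model_remote_flush_would_wait(cwq);
}

static bool model_demo(void)
{
    struct work_model w0 = { self_flushing_func };
    struct work_model w1 = { noop_func };
    struct cwq_model cwq = {0};

    remote_would_wait_during_w0 = true;
    model_queue(&cwq, &w0);
    model_queue(&cwq, &w1);
    model_run_workqueue(&cwq);
    return remote_would_wait_during_w0;
}
```

In this model, model_demo() returns false: the nested run executes
w1 and resets current_work, so a remote flusher checking the queue
while w0 is still running sees nothing to wait for.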

But anyway. Nobody disputes that the "if (cwq->thread == current) {...}"
code in flush_cpu_workqueue() is bad and should die. Otherwise, we
should fix the lockdep warning ;)

The only problem: if there are still users of this hack, they will
deadlock. But perhaps it is time to fix them.

And, if it was not clear, I do agree with this change. And Peter
seems to agree as well.

Oleg.
