Re: net/bluetooth: workqueue destruction WARNING in hci_unregister_dev

From: Tejun Heo
Date: Fri Mar 18 2016 - 16:52:42 EST


Hello, Jiri.

On Thu, Mar 17, 2016 at 01:00:13PM +0100, Jiri Slaby wrote:
> >> I have not done that yet, but today, I see:
> >> destroy_workqueue: name='req_hci0' pwq=ffff88002f590300
> >> wq->dfl_pwq=ffff88002f591e00 pwq->refcnt=2 pwq->nr_active=0 delayed_works:
> >> pwq 12: cpus=0-1 node=0 flags=0x4 nice=-20 active=0/1
> >> in-flight: 18568:wq_barrier_func
> >
> > So, this means that there's flush_work() racing against workqueue
> > destruction, which can't be safe. :(
>
> But I cannot trigger the WARN_ONs in the attached patch, so I am
> confused how this can happen :(. (While I am still seeing the destroy
> WARNINGs.)

So, no operations should be in progress when destroy_workqueue() is
called. If somebody was flushing a work item, that flush call must
have returned before destroy_workqueue() was invoked, which doesn't
seem to be the case here. Can you trigger BUG_ON() or sysrq-t when
the warning above fires? There must be a task which is still flushing
a work item at that point, and its stack trace should make it easy to
pinpoint what's going on.
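
To illustrate the ordering rule, here is a minimal userspace sketch. It
is a toy model and not the kernel workqueue code: struct toy_wq,
toy_flush_begin(), toy_flush_end() and toy_destroy() are all
hypothetical names made up for this example. The counter stands in for
a flusher that has not returned yet, like the in-flight
wq_barrier_func in the dump above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a workqueue; not the kernel structure. */
struct toy_wq {
	int flushers;		/* callers currently inside a flush */
	bool destroyed;
};

/* A flusher enters: from here on, destroying the queue is unsafe. */
static void toy_flush_begin(struct toy_wq *wq)
{
	wq->flushers++;
}

/* The flush call returns: the flusher no longer touches the queue. */
static void toy_flush_end(struct toy_wq *wq)
{
	wq->flushers--;
}

/*
 * Destroy only succeeds when nothing is in progress. Returns 0 on
 * success, -1 (and complains) if a flush has not returned yet.
 */
static int toy_destroy(struct toy_wq *wq)
{
	if (wq->flushers > 0) {
		fprintf(stderr, "toy_destroy: %d flusher(s) still active\n",
			wq->flushers);
		return -1;
	}
	wq->destroyed = true;
	return 0;
}
```

Calling toy_destroy() between toy_flush_begin() and toy_flush_end()
fails, and calling it after toy_flush_end() succeeds. In the real code
the caller has to guarantee that ordering itself; the sketch only shows
which state the destroy path expects to find.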

Thanks.

--
tejun