Re: [RFC] deadlock with flush_work() in UAS

From: Alan Stern
Date: Mon Jun 24 2019 - 10:22:14 EST


On Mon, 24 Jun 2019, Oliver Neukum wrote:

> On Thursday, 20.06.2019 at 07:10 -0700, Tejun Heo wrote:
> > Hello,
> >
> > On Tue, Jun 18, 2019 at 11:59:39AM -0400, Alan Stern wrote:
> > > > > Even if you disagree, perhaps we should have a global workqueue with a
> > > > > permanently set noio flag. It could be shared among multiple drivers
> > > > > such as uas and the hub driver for purposes like this. (In fact, the
> > > > > hub driver already has its own dedicated workqueue.)
> > > >
> > > > That is a good idea. But does UAS need WQ_MEM_RECLAIM?
> > >
> > > These are good questions, and I don't have the answers. Perhaps Tejun
> > > or someone else on LKML can help.
> >
> > Any device which may host a filesystem or swap needs to use
> > WQ_MEM_RECLAIM workqueues on anything which may be used during normal
> > IOs including e.g. error handling which may be invoked. One
> > WQ_MEM_RECLAIM workqueue guarantees one level of concurrency for all
> > its tasks regardless of memory situation, so as long as there's no
> > interdependence between work items, the workqueue can be shared.
>
> Ouch.
>
> Alan, in that case anything doing a reset, suspend or resume needs
> to use WQ_MEM_RECLAIM, it looks to me. What do we do?

I'm not sure this is really a problem.

For example, the reset issue arises only when a driver does the
following:

	Locks the device.

	Queues a work routine to reset the device.

	Waits for the reset to finish.

	Unlocks the device.

But that pattern makes no sense; a driver would never use it. The
driver would just do the reset itself.

There's no problem if the locking is done in the work routine; in that
case the usb-storage or uas driver would be able to carry out any
necessary resets if the work routine was unable to start for lack of
memory.
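
To make the difference between the two patterns concrete, here is a rough
userspace model using pthreads. It is only a sketch and not kernel code:
struct model_dev, reset_work(), direct_reset() and
reset_possible_while_work_pending() are names made up for this
illustration, the mutex stands in for the device lock, and a work routine
that "cannot start for lack of memory" is modelled simply by not creating
the worker thread until later. It also assumes the default pthread mutex
is non-recursive, so that a trylock on a held mutex fails.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a USB device; not a kernel type. */
struct model_dev {
	pthread_mutex_t lock;	/* models the device lock */
	int resets_done;	/* counts completed resets */
};

/* Work routine that does its own locking (the unproblematic pattern). */
static void *reset_work(void *arg)
{
	struct model_dev *dev = arg;

	pthread_mutex_lock(&dev->lock);
	dev->resets_done++;
	pthread_mutex_unlock(&dev->lock);
	return NULL;
}

/* Models usb-storage or uas carrying out a reset on its own behalf. */
static bool direct_reset(struct model_dev *dev)
{
	if (pthread_mutex_trylock(&dev->lock) != 0)
		return false;	/* lock is held elsewhere: reset is blocked */
	dev->resets_done++;
	pthread_mutex_unlock(&dev->lock);
	return true;
}

/*
 * Models a queued reset whose work routine has not started yet, and
 * returns whether a direct reset can still get through in the meantime.
 * queuer_holds_lock == true is the lock/queue/wait/unlock pattern;
 * queuer_holds_lock == false leaves the locking to the work routine.
 */
static bool reset_possible_while_work_pending(bool queuer_holds_lock)
{
	struct model_dev dev = { .resets_done = 0 };
	pthread_t worker;
	bool ok;
	int rc;

	pthread_mutex_init(&dev.lock, NULL);
	if (queuer_holds_lock)
		pthread_mutex_lock(&dev.lock);

	/* The work item is "queued" at this point but is not running. */
	ok = direct_reset(&dev);

	/*
	 * A real driver following the first pattern would wait here for the
	 * work to finish. The model gives up instead so that it terminates.
	 */
	if (queuer_holds_lock)
		pthread_mutex_unlock(&dev.lock);

	/* Memory "becomes available": the work routine finally runs. */
	rc = pthread_create(&worker, NULL, reset_work, &dev);
	assert(rc == 0);
	pthread_join(worker, NULL);

	pthread_mutex_destroy(&dev.lock);
	return ok;
}
```

In this model, reset_possible_while_work_pending(true) reports that the
direct reset is blocked, while reset_possible_while_work_pending(false)
reports that it goes through even though the queued work has not run yet.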

Similarly, while async wakeups might get blocked by lack of memory, the
normal USB driver paths use synchronous wakeup.

Alan Stern