Re: [PATCH] usb: uas: fix usb subsystem hang after power off hub port

From: Alan Stern
Date: Tue Apr 09 2019 - 10:44:08 EST


On Mon, 8 Apr 2019, Martin K. Petersen wrote:

>
> Alan,
>
> > So it looks as though the SCSI subsystem doesn't like to have a reset
> > handler call scsi_remove_host.
>
> Are you talking about a PCI device removal handler or a SCSI error
> handler?

The context of this discussion is a USB mass-storage device where the
device's port on its upstream hub has been powered off. The
powered-off port causes an executing command to time out. As a result
the SCSI error handler runs and calls the USB reset routine, but the
reset fails because the kernel is unable to communicate with the device
through the powered-off port. This causes the USB reset routine to
unbind the device from its USB driver, which in turn calls
scsi_remove_host -- while the error handler is still running.

> > Commands dispatched by the removal routines are forced to wait for the
> > reset recovery to finish, which won't happen until those commands have
> > been completed.
> >
> > Is this a bug in the SCSI core? If not, we need to know what is the
> > right way to do things when a reset handler detects that the SCSI host
> > has been hot-unplugged.
>
> PCI surprise removal should generally work. But it's somewhat unusual
> for a SCSI host to evaporate in the middle of error handling. After all,
> the main purpose of eh is to leverage the interfaces provided by the
> host to try to reconnect to a target that tripped and fell off the
> bus...

Still, it's not impossible for a SCSI host to evaporate in the middle
of error handling, given an appropriately mistimed hot-unplug event.
How does the SCSI layer expect this to be handled? Should the
low-level driver wait to call scsi_remove_host until after the error
handling is finished?
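
For illustration only, the "wait until error handling is finished" option
could be modeled in user space roughly as follows. This is a sketch under
stated assumptions, not kernel code and not a proposed fix: every name
below (model_host, model_request_remove, and so on) is hypothetical, and
model_do_remove merely stands in for the point where a low-level driver
would call scsi_remove_host.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_host {
    bool in_recovery;     /* error handler is currently running */
    bool remove_pending;  /* removal was requested during recovery */
    bool removed;         /* host has been torn down */
};

/* Stand-in for the real teardown (where scsi_remove_host would be called). */
static void model_do_remove(struct model_host *h)
{
    assert(!h->in_recovery);  /* never tear down underneath the error handler */
    h->removed = true;
}

/* Unbind path: tear down now, or defer if recovery is in progress. */
static void model_request_remove(struct model_host *h)
{
    if (h->removed)
        return;
    if (h->in_recovery)
        h->remove_pending = true;
    else
        model_do_remove(h);
}

/* Error handler entry; refuses to start if the host is already gone. */
static bool model_eh_begin(struct model_host *h)
{
    if (h->removed)
        return false;
    h->in_recovery = true;
    return true;
}

/* Reset handler called from the error handler; a failed reset triggers unbind. */
static void model_reset(struct model_host *h, bool port_powered)
{
    if (!port_powered)
        model_request_remove(h);
}

/* Error handler exit; carries out any removal that was deferred. */
static void model_eh_end(struct model_host *h)
{
    h->in_recovery = false;
    if (h->remove_pending) {
        h->remove_pending = false;
        model_do_remove(h);
    }
}
```

With the port powered off, calling model_eh_begin, model_reset and
model_eh_end in that order leaves the removal pending during recovery and
performs it only once recovery has ended. The model is single-threaded; a
real driver would need locking around these flags.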

What about races? In theory, scsi_remove_host could be called just as
the error handler is starting up.

Alan Stern