Re: [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister

From: Johannes Thumshirn
Date: Wed Mar 29 2017 - 07:30:03 EST


On Wed, Mar 29, 2017 at 12:15:44PM +0100, John Garry wrote:
> On 29/03/2017 10:41, Johannes Thumshirn wrote:
> >In the event of an SAS device unregister we have to wait for all destruct
> >works to be done to not accidentally delay deletion of a SAS rphy or its
> >children to the point when we're removing the SCSI or SAS hosts.
> >
> >Signed-off-by: Johannes Thumshirn <jthumshirn@xxxxxxx>
> >---
> > drivers/scsi/libsas/sas_discover.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> >diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
> >index 60de662..75b18f1 100644
> >--- a/drivers/scsi/libsas/sas_discover.c
> >+++ b/drivers/scsi/libsas/sas_discover.c
> >@@ -382,9 +382,13 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
> > 	}
> >
> > 	if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
> >+		struct sas_discovery *disc = &dev->port->disc;
> >+		struct sas_work *sw = &disc->disc_work[DISCE_DESTRUCT].work;
> >+
> > 		sas_rphy_unlink(dev->rphy);
> > 		list_move_tail(&dev->disco_list_node, &port->destroy_list);
> > 		sas_discover_event(dev->port, DISCE_DESTRUCT);
> >+		flush_work(&sw->work);
>
> I quickly tested plugging out the expander and we never get past this call
> to flush - a hang results:

Can you activate lockdep so we can see which lock we're blocking on?

It's most likely in sas_unregister_common_dev(), but that function takes two
spinlocks, port->dev_list_lock and ha->lock, so lockdep output would tell us
which one is involved.
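
For reference, a minimal sketch of a .config fragment that should turn lockdep
on. This assumes a kernel you build yourself, and the exact selection below is
my assumption rather than a required set: CONFIG_PROVE_LOCKING normally pulls
in the other lock debugging options on its own. CONFIG_DETECT_HUNG_TASK is not
part of lockdep, but it should print a backtrace of the task stuck in
flush_work().

```
# Lock dependency validator; enabling this selects lockdep itself
CONFIG_PROVE_LOCKING=y
# Usually selected by PROVE_LOCKING already, listed here for clarity
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_LOCK_ALLOC=y
# Not lockdep: reports tasks blocked in uninterruptible sleep for too long
CONFIG_DETECT_HUNG_TASK=y
```

If CONFIG_MAGIC_SYSRQ is enabled as well, running

    echo d > /proc/sysrq-trigger

after the hang should list the locks that are held at that point.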

Thanks a lot,
Johannes

--
Johannes Thumshirn Storage
jthumshirn@xxxxxxx +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850