Re: [50/98] [SCSI] libfc: fix free of fc_rport_priv with timerpending

From: Greg KH
Date: Tue Jan 26 2010 - 21:05:39 EST


On Tue, Jan 26, 2010 at 04:03:13PM -0800, Joe Eykholt wrote:
> Greg KH wrote:
> > 2.6.32-stable review patch. If anyone has any objections, please let us know.
> >
> > ------------------
> >
> > From: Joe Eykholt <jeykholt@xxxxxxxxx>
> >
> > commit b4a9c7ede96e90f7b1ec009ce7256059295e76df upstream.
> >
> > Timer crashes were caused by freeing a struct fc_rport_priv
> > with a timer pending, causing the timer facility list to be
> > corrupted. This was during FC uplink flap tests with a lot
> > of targets.
> >
> > After discovery, we were doing an PLOGI on an rdata that was
> > in DELETE state but not yet removed from the lookup list.
> > This moved the rdata from DELETE state to PLOGI state.
> > If the PLOGI exchange allocation failed and needed to be
> > retried, the timer scheduling could race with the free
> > being done by fc_rport_work().
> >
> > When fc_rport_login() is called on a rport in DELETE state,
> > move it to a new state RESTART. In fc_rport_work, when
> > handling a LOGO, STOPPED or FAILED event, look for restart
> > state. In the RESTART case, don't take the rdata off the
> > list and after the transport remote port is deleted and
> > exchanges are reset, re-login to the remote port.
> >
> > Note that the new RESTART state also corrects a problem we
> > had when re-discovering a port that had moved to DELETE state.
> > In that case, a new rdata was created, but the old rdata
> > would do an exchange manager reset affecting the FC_ID
> > for both the new rdata and old rdata. With the new state,
> > the new port isn't logged into until after any old exchanges
> > are reset.
> >
> > Signed-off-by: Joe Eykholt <jeykholt@xxxxxxxxx>
> > Signed-off-by: Robert Love <robert.w.love@xxxxxxxxx>
> > Signed-off-by: James Bottomley <James.Bottomley@xxxxxxx>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>
> >
> > ---
> > drivers/scsi/libfc/fc_rport.c | 69 ++++++++++++++++++++++++++++++------------
> > include/scsi/libfc.h | 1
> > 2 files changed, 51 insertions(+), 19 deletions(-)
> >
> > --- a/drivers/scsi/libfc/fc_rport.c
> > +++ b/drivers/scsi/libfc/fc_rport.c
> > @@ -86,6 +86,7 @@ static const char *fc_rport_state_names[
> > [RPORT_ST_LOGO] = "LOGO",
> > [RPORT_ST_ADISC] = "ADISC",
> > [RPORT_ST_DELETE] = "Delete",
> > + [RPORT_ST_RESTART] = "Restart",
> > };
> >
> > /**
> > @@ -99,8 +100,7 @@ static struct fc_rport_priv *fc_rport_lo
> > struct fc_rport_priv *rdata;
> >
> > list_for_each_entry(rdata, &lport->disc.rports, peers)
> > - if (rdata->ids.port_id == port_id &&
> > - rdata->rp_state != RPORT_ST_DELETE)
> > + if (rdata->ids.port_id == port_id)
> > return rdata;
> > return NULL;
> > }
> > @@ -235,6 +235,7 @@ static void fc_rport_work(struct work_st
> > struct fc_rport_operations *rport_ops;
> > struct fc_rport_identifiers ids;
> > struct fc_rport *rport;
> > + int restart = 0;
> >
> > mutex_lock(&rdata->rp_mutex);
> > event = rdata->event;
> > @@ -287,8 +288,19 @@ static void fc_rport_work(struct work_st
> > mutex_unlock(&rdata->rp_mutex);
> >
> > if (port_id != FC_FID_DIR_SERV) {
> > + /*
> > + * We must drop rp_mutex before taking disc_mutex.
> > + * Re-evaluate state to allow for restart.
> > + * A transition to RESTART state must only happen
> > + * while disc_mutex is held and rdata is on the list.
> > + */
> > mutex_lock(&lport->disc.disc_mutex);
> > - list_del(&rdata->peers);
> > + mutex_lock(&rdata->rp_mutex);
> > + if (rdata->rp_state == RPORT_ST_RESTART)
> > + restart = 1;
> > + else
> > + list_del(&rdata->peers);
>
> There is a follow-up patch that adds this line at this point:
>
> rdata->event = RPORT_EV_NONE;
>
> If this patch is integrated, that one should be integrated
> as well. That patch is:
>
> commit 5543c72e2bbb30e5ba5938b18ec26617b8b3fb04
> Author: Abhijeet Joglekar <abjoglek@xxxxxxxxx>
> Date: Thu Dec 10 09:59:20 2009 -0800
>
> [SCSI] libfc: remote port gets stuck in restart state without really restarting

Thank you so much for the information, I have queued up this patch as
well for the .32 tree.

thanks for the review,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/