Re: [PATCH 3/3] target: simplify target_wait_for_sess_cmds()

From: Nicholas A. Bellinger
Date: Mon May 13 2013 - 23:06:24 EST


On Mon, 2013-05-13 at 18:00 -0400, JÃrn Engel wrote:
> On Mon, 13 May 2013 16:00:03 -0700, Nicholas A. Bellinger wrote:
> > On Mon, 2013-05-13 at 16:30 -0400, Joern Engel wrote:
> > > The second parameter was always 0, leading to effectively dead code. It
> > > called list_del() and se_cmd->se_tfo->release_cmd(), and had to set a
> > > flag to prevent target_release_cmd_kref() from doing the same.
> >
> > Look again. The call to transport_wait_for_tasks() was dead code, but
> > the wait_for_completion(&se_cmd->cmd_wait_comp) most certainly is not.
>
> See "totally wrong" below.
>
> > > But most
> > > of all, it iterated the list without taking se_sess->sess_cmd_lock,
> > > leading to races against ABORT and LUN_RESET.
> >
> > Ugh. You'll recall that target_wait_for_sess_cmds() originally did not
> > have to take the lock because the list was spliced into
> > sess_wait_list.
> >
> > When Roland removed sess_wait_list in commit 1c7b13fe, no one re-added
> > the lock here.
>
> Interesting point. That seems to imply that reverting 1c7b13fe would
> be an alternative approach.
>
> > > Since the whole point of the function is to wait for the list to drain,
> > > and potentially print a bit of debug information in case that never
> > > happens, I've replaced the wait_for_completion() with 100ms sleep. The
> > > only callpath that would get delayed by this is rmmod, afaics, so I
> > > didn't want the overhead of a waitqueue.
> >
> > This seems totally wrong..
>
> The wait_for_completion() was not dead code, but it was just one
> possible implementation of "wait for the list to drain". I dislike
> that particular implementation because you have to drop the spinlock
> before waiting and at the same time wait for a specific command.
> Since you no longer hold any locks, the command can say *poof* and
> disappear from under you at any point.

So I'd rather re-instate the list splice within
target_sess_cmd_list_set_waiting(), keep target_wait_for_sess_cmds()
lock-less performing wait_for_completions() on cmd_wait_comp, and keep
the existing cmd_wait_set assignment + check in place.


> Indeed it has to. So maybe
> you could take a refcount while waiting for this command to prevent
> that, which implies you have to check for this special refcount
> elsewhere and...

I don't think that will be necessary..

>
> At this point most readers should shudder in disgust and look for some
> alternate implementation. I don't say mine is perfect, but at least
> it does not care about any particular command.
>
> > > - if (!rc) {
> > > - wait_for_completion(&se_cmd->cmd_wait_comp);
> > > - pr_debug("After cmd_wait_comp: se_cmd: %p t_state: %d"
> > > - " fabric state: %d\n", se_cmd, se_cmd->t_state,
> > > - se_cmd->se_tfo->get_cmd_state(se_cmd));
> > > + spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
> > > + while (!list_empty(&se_sess->sess_cmd_list)) {
> > > + se_cmd = list_entry(se_sess->sess_cmd_list.next, struct se_cmd,
> > > + se_cmd_list);
> > > + if (se_cmd != last_cmd) { /* print this only once per command */
> > > + pr_debug("Waiting for se_cmd: %p t_state: %d, fabric state: %d\n",
> > > + se_cmd, se_cmd->t_state,
> > > + se_cmd->se_tfo->get_cmd_state(se_cmd));
> > > + last_cmd = se_cmd;
> > > }
> > > -
> > > - se_cmd->se_tfo->release_cmd(se_cmd);
> > > + spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
> > > + msleep_interruptible(100);
> > > + spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
> > > }
> > > + spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
> > > }
> >
> > So what happens when the backend se_cmd I/O does not complete before the
> > msleep finishes..?
>
> You take the lock, check list_empty() and go to sleep again. Repeat
> until the backend I/O does complete or you get a hung task, whichever
> comes first.
>
> Which is exactly the same behaviour you had before.
>
> > It seems totally wrong to drop the initial cmd_wait_set =1 assignment,
> > target_release_cmd_kref() completion for cmd_wait_comp, and wait on
> > cmd_wait_comp to allow se_cmd to finish processing here.
> >
> > Who cares about waitqueue overhead in a shutdown slow path when the
> > replacement doesn't address long outstanding commands..?
>
> I agree that the overhead doesn't matter. The msleep(100) spells this
> out rather explicitly. What does matter is that a) the patch retains
> old behaviour with much simpler code and b) it fixes a race that kills
> the machine. I can live without a, but very much want to keep b. ;)
>

Fucking around with ->sess_cmd_lock during each loop of ->sess_cmd_list
in target_wait_for_sess_cmds is not simpler code..

Please re-spin a patch that re-instates the list splice part of commit
1c7b13fe6, and only drops the wait_for_tasks case check in
target_wait_for_sess_cmds()

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/