James Bottomley [James.Bottomley@SteelEye.com] wrote:
> James.Bottomley@steeleye.com said:
>
> Unfortunately, this is going to involve deep hackery inside the error handler.
> The current initial premise is that it can simply retry the failing command
> by issuing an ABORT to the tag and resending it (which can cause a tag to move
> past your barrier). In an error situation, it really wouldn't be wise to try
> to abort lots of potentially running tags to preserve the barrier ordering
> (because of the overload placed on a known failing component), so I think the
> error handler has to abandon the concept of aborting commands and move
> straight to device reset. We then carefully resend the commands in FIFO order.
>
> Additionally, you must handle the case that a device is reset by something
> else (in error handler terms, the cc_ua [check condition/unit attention]).
> Here also, the tags would have to be sent back down in FIFO order as soon as
> the condition is detected.
I agree on the hackery. You will also need to clear the pending unit
attention out of the pipe after the reset; otherwise the first command you
send down will come straight back into the error handler, and you could end
up in an error recovery storm.
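
To make the ordering concrete, here is a rough user-space sketch of the
sequence discussed above: reset, absorb the unit attention, then resend the
outstanding tags in FIFO order. It is only a toy model under stated
assumptions. Every name in it (sim_device, sim_probe, sim_send, sim_recover)
is hypothetical and is not part of the SCSI mid-layer, and the "device" is
just a flag plus a log of accepted tags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
#define SIM_LOG_MAX 16

enum sim_status { SIM_GOOD, SIM_UNIT_ATTENTION };

struct sim_device {
    bool   ua_pending;        /* set by a reset, consumed by the next command */
    int    log[SIM_LOG_MAX];  /* tags the device accepted, in arrival order */
    size_t nlog;
};

/* A reset leaves a unit attention pending on the device. */
static void sim_reset(struct sim_device *d)
{
    d->ua_pending = true;
}

/* Throwaway command (think TEST UNIT READY): reports and clears a pending UA. */
static enum sim_status sim_probe(struct sim_device *d)
{
    if (d->ua_pending) {
        d->ua_pending = false;
        return SIM_UNIT_ATTENTION;
    }
    return SIM_GOOD;
}

/* A real command: bounced if a UA is pending, otherwise accepted and logged. */
static enum sim_status sim_send(struct sim_device *d, int tag)
{
    if (sim_probe(d) == SIM_UNIT_ATTENTION)
        return SIM_UNIT_ATTENTION;
    assert(d->nlog < SIM_LOG_MAX);
    d->log[d->nlog++] = tag;
    return SIM_GOOD;
}

/*
 * Reset the device, optionally drain the UA, then resend tags[0..n-1] in
 * FIFO order. Returns how many real commands were bounced by a UA, i.e. how
 * many times we would have re-entered the error handler.
 */
static int sim_recover(struct sim_device *d, const int *tags, size_t n,
                       bool drain_ua)
{
    int bounced = 0;

    sim_reset(d);
    if (drain_ua) {
        /* Absorb the UA with a throwaway command before any real I/O. */
        while (sim_probe(d) == SIM_UNIT_ATTENTION)
            ;
    }
    for (size_t i = 0; i < n; i++) {
        /* Retry the same tag until accepted so nothing overtakes it. */
        while (sim_send(d, tags[i]) != SIM_GOOD)
            bounced++;
    }
    return bounced;
}
```

Calling sim_recover() with drain_ua set to false bounces the first resent
command once in this model, which is the re-entry into the error handler
described above; with drain_ua set to true nothing bounces. In both cases
the device sees the tags in their original order.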
>
> James
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
-- 
Michael Anderson
andmike@us.ibm.com
This archive was generated by hypermail 2b29 : Fri Feb 15 2002 - 21:01:08 EST