Re: 2.6.24-rc3-mm1: I/O error, system hangs

From: James Bottomley
Date: Sat Nov 24 2007 - 12:44:38 EST


Probing intermittent failures in Domain Validation, even with the fixes
applied leads me to the conclusion that there are further problems with
this commit:

commit fc5eb4facedbd6d7117905e775cee1975f894e79
Author: Hannes Reinecke <hare@xxxxxxx>
Date: Tue Nov 6 09:23:40 2007 +0100

[SCSI] Do not requeue requests if REQ_FAILFAST is set

The essence of the problems is that you're causing REQ_FAILFAST to
terminate commands with error on requeuing conditions, some of which are
relatively common on most SCSI devices. While this may be the correct
behaviour for multi-path, it's certainly wrong for the previously
understood meaning of REQ_FAILFAST, which was don't retry on error,
which is why domain validation and other applications use it to control
error handling, but don't expect to get failures for a simple requeue
are now spitting errors.

I honestly can't see that, even for the multi-path case, returning an
error when we're over queue depth is the correct thing to do (it may not
matter to something like a symmetrix, but an array that has a non-zero
cost associated with a path change, like a CPQ HSV or the AVT
controllers, will show fairly large slow downs if you do this). Even if
this is the desired behaviour (and I think that's a policy issue),
DID_NO_CONNECT is almost certainly the wrong error to be sending back.

This patch fixes up domain validation to work again correctly, however,
I really think it's just a bandaid. Do you want to rethink the above
commit?

James

Index: BUILD-2.6/drivers/scsi/scsi_lib.c
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_lib.c 2007-11-24 11:25:20.000000000 -0600
+++ BUILD-2.6/drivers/scsi/scsi_lib.c 2007-11-24 11:26:22.000000000 -0600
@@ -1552,7 +1552,8 @@ static void scsi_request_fn(struct reque
break;

if (!scsi_dev_queue_ready(q, sdev)) {
- if (req->cmd_flags & REQ_FAILFAST) {
+ if ((req->cmd_flags & REQ_FAILFAST) &&
+ !(req->cmd_flags & REQ_PREEMPT)) {
scsi_kill_request(req, q);
continue;
}


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/