Re: kernel BUG at block/blk-timeout.c:178!

From: Alan D. Brunelle
Date: Thu Dec 04 2008 - 16:07:18 EST

Alan D. Brunelle wrote:
> Alan D. Brunelle wrote:
>> Jens Axboe wrote:
>>> Alan, can you try latest -git? feaf3848a813a106f163013af6fcf6c4bfec92d9
>>> or later.
>> git pull()ed to: feaf3848a813a106f163013af6fcf6c4bfec92d9 and the same
>> problem occurs.
> Maybe not - I've not been able to reproduce that problem in subsequent
> reboots. It could be that I booted the wrong kernel first time (rc6
> instead of rc7). Will keep plugging - any idea as to what might have
> "fixed" the problem between rc6 & rc7?
> Alan

It's back - just not as easily reproduced as before.

I'm concerned over this piece of code:

/*
 * hp_sw_tur - Send TEST UNIT READY
 * @sdev: sdev command should be sent to
 *
 * Use the TEST UNIT READY command to determine
 * the path state.
 */
static int hp_sw_tur(struct scsi_device *sdev, struct hp_sw_dh_data *h)
{
	struct request *req;
	int ret;

	req = blk_get_request(sdev->request_queue, WRITE, GFP_NOIO);
	if (!req)
		return SCSI_DH_RES_TEMP_UNAVAIL;

	req->cmd_type = REQ_TYPE_BLOCK_PC;
	req->cmd[0] = TEST_UNIT_READY;
	req->timeout = HP_SW_TIMEOUT;
	req->sense = h->sense;
	memset(req->sense, 0, SCSI_SENSE_BUFFERSIZE);
	req->sense_len = 0;

retry:
	ret = blk_execute_rq(req->q, NULL, req, 1);
	if (ret == -EIO) {
		if (req->sense_len > 0) {
			ret = tur_done(sdev, h->sense);
		} else {
			sdev_printk(KERN_WARNING, sdev,
				    "%s: sending tur failed with %x\n",
				    HP_SW_NAME, req->errors);
			ret = SCSI_DH_IO;
		}
	} else {
		h->path_state = HP_SW_PATH_ACTIVE;
		ret = SCSI_DH_OK;
	}
	if (ret == SCSI_DH_IMM_RETRY)
		goto retry;
	if (ret == SCSI_DH_DEV_OFFLINED) {
		h->path_state = HP_SW_PATH_PASSIVE;
		ret = SCSI_DH_OK;
	}


	return ret;
}

I've pushed the BUG_ON check into blk_execute_rq, and it finds the bit
already set there. Could we be getting SCSI_DH_IMM_RETRYs, so that the
same request is reused without being re-initialized, and on error the
bit is not being cleaned up properly?

I'm checking that out next...
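
To make the suspicion concrete, here is a small user-space sketch of the
pattern in question. It is not kernel code and does not confirm anything:
struct toy_request, toy_rq_init(), toy_execute() and the two retry helpers
are all hypothetical names, and the "complete" flag is only a stand-in for
whatever bit the BUG_ON is testing. It assumes a submit path that refuses a
request whose flag is still set from an earlier completion.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct request; only what the sketch needs. */
struct toy_request {
	bool complete;		/* stand-in for the bit the BUG_ON tests */
	int timeout;
	unsigned char cmd[16];
};

/* Reset all per-submission state, including the completion flag. */
void toy_rq_init(struct toy_request *rq, unsigned char opcode, int timeout)
{
	memset(rq, 0, sizeof(*rq));
	rq->cmd[0] = opcode;
	rq->timeout = timeout;
}

/*
 * Toy submit path: returns -1 where the kernel check would fire (flag
 * still set on entry), otherwise "runs" the request, marks it complete
 * and returns 0.
 */
int toy_execute(struct toy_request *rq)
{
	if (rq->complete)
		return -1;
	rq->complete = true;
	return 0;
}

/* Retry by resubmitting the same request untouched; count check hits. */
int toy_retry_stale(int attempts)
{
	struct toy_request rq;
	int tripped = 0;

	toy_rq_init(&rq, 0x00, 30);	/* 0x00 is the TEST UNIT READY opcode */
	for (int i = 0; i < attempts; i++)
		if (toy_execute(&rq) < 0)
			tripped++;
	return tripped;
}

/* Retry with the request re-initialized before every submission. */
int toy_retry_fresh(int attempts)
{
	struct toy_request rq;
	int tripped = 0;

	for (int i = 0; i < attempts; i++) {
		toy_rq_init(&rq, 0x00, 30);
		if (toy_execute(&rq) < 0)
			tripped++;
	}
	return tripped;
}
```

In this toy model every resubmission after the first trips the check when
the request is reused as-is, and none do when it is reset first. Whether
the real retry path behaves the same way is exactly what still needs
checking.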

