Re: [PATCH 07/17] hpsa: do not give up retry of driver cmds after only 3 retries

From: scameron
Date: Tue May 01 2012 - 14:20:15 EST


On Tue, May 01, 2012 at 07:26:34PM +0200, Andi Shyti wrote:
> Hi,
>
> > --- a/drivers/scsi/hpsa.c
> > +++ b/drivers/scsi/hpsa.c
> > @@ -1380,17 +1380,24 @@ static void hpsa_scsi_do_simple_cmd_core_if_no_lockup(struct ctlr_info *h,
> > }
> > }
> >
> > +#define MAX_DRIVER_CMD_RETRIES 25
> > static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
> > struct CommandList *c, int data_direction)
> > {
> > - int retry_count = 0;
> > + int backoff_time = 10, retry_count = 0;
> >
> > do {
> > memset(c->err_info, 0, sizeof(*c->err_info));
> > hpsa_scsi_do_simple_cmd_core(h, c);
> > retry_count++;
> > + if (retry_count > 3) {
> > + msleep(backoff_time);
>
> for 10ms isn't it better to avoid using msleep?

[...]
> > + if (backoff_time < 1000)
> > + backoff_time *= 2;

Eh, maybe. from Documentation/timers-howto.txt

msleep(1~20) may not do what the caller intends, and
will often sleep longer (~20 ms actual sleep for any
value given in the 1~20ms range). In many cases this
is not the desired behavior.

Sleeping longer (~20ms instead of 10ms) in this instance is fine, as I don't
really care too much exactly how long it sleeps, and it backs off to up to
1280ms eventually anyway. The idea is, "wait a bit, and retry, and then if
that doesn't work, wait twice as long, and retry, etc." *exactly* how long
"a bit" is is not super important. I could change the initial back_off time
to 20 or 30 to satisfy the letter of the advice in Documentation/timers-howto.txt,
if doing so is important.

This is kind of a corner case of a corner case, I don't expect
things will ordinarily end up waiting that long, because normally
one of the 1st 3 retries will succeed. I just wanted to make it
a little more robust and not just give up immediately if the 3
initial retries don't succeed, the specific number of retries,
wait times, etc, I just made up. It still does eventually give up
though, and then probably doesn't do anything good after that
(same as current behavior, just somewhat less likely to get to
that point.) I'm not actually aware of any complaints of the
retries failing though (apart from the complaint that prompted
the patch prior to this one, that we didn't retry on getting
SAM_STAT_BUSY.)

-- steve

> > + }
> > } while ((check_for_unit_attention(h, c) ||
> > - check_for_busy(h, c)) && retry_count <= 3);
> > + check_for_busy(h, c)) &&
> > + retry_count <= MAX_DRIVER_CMD_RETRIES);
> > hpsa_pci_unmap(h->pdev, c, 1, data_direction);
> > }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/