Re: [PATCH] mmc_spi: do propper retry managment in the block layer- rewritten

From: Matt Fleming
Date: Mon May 18 2009 - 16:36:10 EST


On Mon, May 18, 2009 at 06:00:40PM +0200, Wolfgang Mües wrote:
> Matt,
>
> Am Montag, 18. Mai 2009 schrieb Matt Fleming:
> >
> > When you say "proper retry management", what problem are you having
> > with the current code?
>
> a) Writes were not retried at all, so every transmission error created a data
> loss.
> b) There was no clear distinction between errors which comes from transmission
> faults and errors which are by nature non-recoverable.
> c) After one error, the remaining sectors were read one by one. This is
> unneccessary time-consuming.
>
> Cause a) was my initial problem with this code. This was not production
> quality.
>

Ah, now I understand. I've also been contacted privately by someone
who says that they've been experiencing similar issues and that your
patch has fixed them. Clearly there are people out there with flaky
hardware ;-)

> > > @@ -262,10 +262,12 @@ static int mmc_blk_issue_rq(struct mmc_q
> > > if (brq.data.blocks > 1) {
> > > /* SPI multiblock writes terminate using a special
> > > * token, not a STOP_TRANSMISSION request.
> > > + * Here, this request is set for ALL types of
> > > + * hosts, so that we can use it in the lower
> > > + * layers if the data transfer stage has failed
> > > + * and the card is not able to accept the token.
> > > */
> > > - if (!mmc_host_is_spi(card->host)
> > > - || rq_data_dir(req) == READ)
> > > - brq.mrq.stop = &brq.stop;
> > > + brq.mrq.stop = &brq.stop;
> > > readcmd = MMC_READ_MULTIPLE_BLOCK;
> > > writecmd = MMC_WRITE_MULTIPLE_BLOCK;
> > > } else {
> >
> > Hmm.. is it OK to abuse the STOP_TRANSMISSION request like this? Does
> > SPI multiblock write still terminate with this patch?
>
> IMHO, this is the right thing to do. If you have a transmission error
> while SPI multiblock write, the card will miss the stop token. So the
> card is in multiblock write mode and has to be stopped. You cannot
> retry the stop token. Every card I have tested accepted the STOP_TRANSMISSION
> request. So it is better to try a STOP_TRANSMISSION, because this will
> recover communication with the card. If a particular card will not respond to
> the STOP_TRANSMISSION, the situation is not worse as before...
>
> The STOP_TRANSMISSION request is only send to the card if the card
> has not responded to the stop token (see the change in mmc_spi.c).
> The default non-error behaviour has not changed.
>

I'll defer to Pierre on that one.

I feel that retrying data accesses should be done at a higher layer
than the MMC/SD subsystem. But maybe I'm just being impractical. This
patch solves a real world problem that is biting some people, and so
based on that, the approach seems OK to me FWIW.


diff -uprN 2_6_29_rc7_patch_EILSEQ/drivers/mmc/host/mmc_spi.c 2_6_29_rc7_patch_retries/drivers/mmc/host/mmc_spi.c
--- 2_6_29_rc7_patch_EILSEQ/drivers/mmc/host/mmc_spi.c 2009-05-14 12:49:42.000000000 +0200
+++ 2_6_29_rc7_patch_retries/drivers/mmc/host/mmc_spi.c 2009-05-14 15:01:15.000000000 +0200
@@ -743,12 +743,11 @@ mmc_spi_writeblock(struct mmc_spi_host *
if (status != 0) {
dev_dbg(&spi->dev, "write error %02x (%d)\n",
scratch->status[0], status);
- return status;
- }
-
+ } else {
t->tx_buf += t->len;
if (host->dma_dev)
t->tx_dma += t->len;
+ }

By the way, the indentation needs fixing here.

@@ -1087,7 +1089,15 @@ static void mmc_spi_request(struct mmc_h
status = mmc_spi_command_send(host, mrq, mrq->cmd, mrq->data != NULL);
if (status == 0 && mrq->data) {
mmc_spi_data_do(host, mrq->cmd, mrq->data, mrq->data->blksz);
- if (mrq->stop)
+ /* filter-out the stop command for multiblock writes,
+ * only if the data stage has no transmission error.
+ * If the data stage has a transmission error, send the
+ * STOP command because there is a great chance that the
+ * SPI stop token was not accepted by the card.
+ */
+ if (mrq->stop &&
+ ((mrq->data->flags & MMC_DATA_READ)
+ || mrq->data->error))
status = mmc_spi_command_send(host, mrq, mrq->stop, 0);
else
mmc_cs_off(host);

And here..
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/