RE: [PATCH 2/2] mmc: core: Add no single read retries

From: DooHyun Hwang
Date: Wed Feb 17 2021 - 03:47:02 EST



On 17/02/21 8:00 am, Adrian Hunter wrote:
>On 17/02/21 7:46 am, Adrian Hunter wrote:
>> On 17/02/21 7:22 am, DooHyun Hwang wrote:
>>> This makes to handle read errors faster by not retrying multiple
>>> block read(CMD18) errors with single block reads(CMD17).
>>>
>>> On some bad SD Cards that have problem with read operations, it is
>>> not helpful to retry multiple block read errors with several single
>>> block reads, and it is delayed to treat read operations as I/O error
>>> as much as retrying single block reads.
>>
>> If the issue is that it takes too long, then maybe it would be better
>> to get
>> mmc_blk_read_single() to give up after a certain amount of time.
>>
>
>So that a device property would not be needed I mean. Then everyone would
>benefit.

Thank you for reviewing this.

mmc_blk_read_single() takes a different time depending on the number of
sectors to read and the timeout value for each CMD.

I think it's difficult to set the criteria for "a certain amount of time"
you talked about.
And it's harder to proceed with any errors caused by giving up in
mmc_blk_read_single() than no retrying.

So, I would like to add a configurable property to skip the single block
read retrying because if multiple block read error occurs, single block
read retrying doesn't help for some bad SD cards.

This is the log to check for this patch.
#0. time difference is about 2.37s for 8 sectors between with(#1) and without(#2)
single block read retrying
This is a test for just one CMD18.
When there are many I/O requests, it takes too long to handle the errors.

#1. retry multiple block read (8 sectors) error with single block reads
// It takes about 3.585671s for the I/O error.
// issue CMD23 (+ arg 0x8)
// issue CMD18 (+ arg 0x000320e0) and error occurs
<7>[ 316.657115] [5: kworker/5:1H: 324] <mmc0: starting CMD23 arg 00000008 flags 00000015>
<7>[ 316.657124] [5: kworker/5:1H: 324] mmc0: starting CMD18 arg 000320e0 flags 000000b5
<7>[ 316.826302] I[0: swapper/0: 0] mmc0: req done <CMD23>: 0: 00000000 00000000 00000000 00000000
<7>[ 316.826327] I[0: swapper/0: 0] mmc0: req done (CMD18): 0: 00000900 00000000 00000000 00000000
<7>[ 316.826362] I[0: swapper/0: 0] mmc0: 0 bytes transferred: -110
<7>[ 316.826389] I[0: swapper/0: 0] mmc0: (CMD12): 0: 00000b00 00000000 00000000 00000000
<7>[ 316.826516] [0: kworker/0:1H: 338] mmc0: starting CMD13 arg 00010000 flags 00000195
<7>[ 316.826621] I[0: kworker/0:1H: 338] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 00000000
// retry CMD18 (+ arg 0x000320e0) and error occurs again. Same as above.
<7>[ 316.829224] [5: kworker/5:1H: 324] <mmc0: starting CMD23 arg 00000008 flags 00000015>
<7>[ 316.829237] [5: kworker/5:1H: 324] mmc0: starting CMD18 arg 000320e0 flags 000000b5
<7>[ 316.999588] I[0: swapper/0: 0] mmc0: req done <CMD23>: 0: 00000000 00000000 00000000 00000000
<7>[ 316.999653] I[0: swapper/0: 0] mmc0: req done (CMD18): 0: 00000900 00000000 00000000 00000000
<7>[ 316.999725] I[0: swapper/0: 0] mmc0: 0 bytes transferred: -110
<7>[ 316.999789] I[0: swapper/0: 0] mmc0: (CMD12): 0: 00000b00 00000000 00000000 00000000
<7>[ 317.000034] [0: kworker/0:1H: 338] mmc0: starting CMD13 arg 00010000 flags 00000195
<7>[ 317.000370] I[0: kworker/0:1H: 338] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 00000000
// mmc_blk_reset() and it's completed
<7>[ 317.000523] [0: kworker/0:1H: 338] mmc0: clock 0Hz busmode 2 powermode 0 cs 0 Vdd 0 width 1 timing 0
...
// mmc_blk_read_single() : CMD17, CMD13 and CMD12 repeats 8 times (for retrying multiple block read with 8 sectors)
// CMD17 (+ arg 0x000320e0 ~ 0x000320e7) and timeout errors occur
// It takes about 1.351s
<7>[ 317.200351] [0: kworker/0:1H: 338] mmc0: starting CMD17 arg 000320e0 flags 000000b5
<7>[ 317.368748] I[0: swapper/0: 0] mmc0: req done (CMD17): 0: 00000900 00000000 00000000 00000000
<7>[ 317.368776] I[0: swapper/0: 0] mmc0: 0 bytes transferred: -110
<7>[ 317.368871] [0: kworker/0:1H: 338] mmc0: starting CMD13 arg 00010000 flags 00000195
<7>[ 317.368932] I[0: kworker/0:1H: 338] mmc0: sdhci: IRQ status 0x00000001
<7>[ 317.368970] I[0: kworker/0:1H: 338] mmc0: req done (CMD13): 0: 00000b00 00000000 00000000 00000000
<7>[ 317.369020] [0: kworker/0:1H: 338] mmc0: starting CMD12 arg 00000000 flags 00000095
<7>[ 317.369070] I[0: kworker/0:1H: 338] mmc0: sdhci: IRQ status 0x00000001
<7>[ 317.369108] I[0: kworker/0:1H: 338] mmc0: req done (CMD12): 0: 00000b00 00000000 00000000 00000000
<7>[ 317.369155] [0: kworker/0:1H: 338] mmc0: starting CMD13 arg 00010000 flags 00000195
<7>[ 317.369204] I[0: kworker/0:1H: 338] mmc0: sdhci: IRQ status 0x00000001
<7>[ 317.369245] I[0: kworker/0:1H: 338] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 00000000
<3>[ 317.369298] [0: kworker/0:1H: 338] print_req_error: I/O error, dev mmcblk0, sector 205024
<7>[ 317.369342] [0: kworker/0:1H: 338] mmc0: starting CMD17 arg 000320e1 flags 000000b5
...
<7>[ 318.382668] [0: kworker/0:1H: 338] mmc0: starting CMD17 arg 000320e7 flags 000000b5
<3>[ 318.551568] [0: kworker/0:1H: 338] print_req_error: I/O error, dev mmcblk0, sector 205031
// retry CMD18 (+ arg 0x000320e0) and error occurs again.
<7>[ 318.551850] [5: kworker/5:1H: 324] <mmc0: starting CMD23 arg 00000008 flags 00000015>
<7>[ 318.551867] [5: kworker/5:1H: 324] mmc0: starting CMD18 arg 000320e0 flags 000000b5
...
// retry CMD18 (+ arg 0x000320e0) and error occurs again.
<7>[ 318.721767] [5: kworker/5:1H: 324] mmc0: starting CMD18 arg 000320e0 flags 000000b5
// CMD17 (+ arg 0x000320e0 ~ 0x000320e7)
<7>[ 318.891054] [0: kworker/0:1H: 338] mmc0: starting CMD17 arg 000320e0 flags 000000b5
...
<7>[ 320.073861] [0: kworker/0:1H: 338] mmc0: starting CMD17 arg 000320e7 flags 000000b5
// Return I/O error for read operation finally
<3>[ 320.242786] [0: kworker/0:1H: 338] Buffer I/O error on dev mmcblk0, logical block 25628, async page read

#2. retry multiple block read (8 sectors) error without single block reads
// It takes about 1.205941s for the I/O error.
// issue CMD23 (+ arg 0x8)
// issue CMD18 (+ arg 0x000320e0) and error occurs
<7>[ 126.467114] [7: kworker/7:2H: 8887] <mmc0: starting CMD23 arg 00000008 flags 00000015>
<7>[ 126.467125] [7: kworker/7:2H: 8887] mmc0: starting CMD18 arg 000320e0 flags 000000b5
<7>[ 126.636188] I[0:Measurement Wor: 9074] mmc0: req done <CMD23>: 0: 00000000 00000000 00000000 00000000
<7>[ 126.636213] I[0:Measurement Wor: 9074] mmc0: req done (CMD18): 0: 00000900 00000000 00000000 00000000
<7>[ 126.636241] I[0:Measurement Wor: 9074] mmc0: 0 bytes transferred: -110
<7>[ 126.636265] I[0:Measurement Wor: 9074] mmc0: (CMD12): 0: 00000b00 00000000 00000000 00000000
<7>[ 126.636379] [0: kworker/0:1H: 336] mmc0: starting CMD13 arg 00010000 flags 00000195
<7>[ 126.636495] I[0: kworker/0:1H: 336] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 00000000
// retry CMD18 (+ arg 0x000320e0) and error occurs again. Same as above.
<7>[ 126.638284] [7: kworker/7:2H: 8887] <mmc0: starting CMD23 arg 00000008 flags 00000015>
<7>[ 126.638298] [7: kworker/7:2H: 8887] mmc0: starting CMD18 arg 000320e0 flags 000000b5
// mmc_blk_reset() and it's completed
<7>[ 126.807645] [0: kworker/0:1H: 336] mmc0: clock 0Hz busmode 2 powermode 0 cs 0 Vdd 0 width 1 timing 0
...
// no mmc_blk_read_single() calling
// retry CMD18 (+ arg 0x000320e0) and error occurs again.
<7>[ 126.993628] [7: kworker/7:2H: 8887] <mmc0: starting CMD23 arg 00000008 flags 00000015>
<7>[ 126.993643] [7: kworker/7:2H: 8887] mmc0: starting CMD18 arg 000320e0 flags 000000b5
// retry CMD18 (+ arg 0x000320e0) and error occurs again.
<7>[ 127.164836] [7: kworker/7:2H: 8887] <mmc0: starting CMD23 arg 00000008 flags 00000015>
<7>[ 127.164848] [7: kworker/7:2H: 8887] mmc0: starting CMD18 arg 000320e0 flags 000000b5
...
// Return I/O error for read operation finally
<3>[ 127.673055] I[7: swapper/7: 0] Buffer I/O error on dev mmcblk0, logical block 25628, async page read


Thank you.