Re: WARNING: CPU: 0 PID: 1271 at drivers/mmc/core/core.c:991 mmc_release_host+0xa0/0xa8

From: John Stultz
Date: Fri Aug 12 2016 - 19:33:39 EST


On Fri, Aug 12, 2016 at 1:01 AM, Jaehoon Chung <jh80.chung@xxxxxxxxxxx> wrote:
> On 08/12/2016 04:13 PM, John Stultz wrote:
>> On Thu, Aug 4, 2016 at 9:52 PM, John Stultz <john.stultz@xxxxxxxxxx> wrote:
>>> Hey Ulf,
>>> Since moving my HiKey branch to pre-v4.8-rc1 (linus's HEAD), I've
>>> been seeing the following warning occasionally. Usually after seeing
>>> it, the system will refuse to reboot (system does the "Emergency
>>> remount complete" but then just sits there, and if I ctrl-c I can use
>>> the shell fine but many commands will get me stuck).
>>>
>>> Anyway, if you have any ideas...
>>>
>>> thanks
>>> -john
>>>
>>> [ 24.154245] ------------[ cut here ]------------
>>> [ 24.158903] WARNING: CPU: 2 PID: 1273 at
>>> drivers/mmc/core/core.c:991 mmc_release_host+0xa0/0xa8
>>> [ 24.167605]
>>> [ 24.169104] CPU: 2 PID: 1273 Comm: mmcqd/0 Not tainted
>>> 4.7.0-11945-gb30f1d6-dirty #706
>>> [ 24.177024] Hardware name: HiKey Development Board (DT)
>>> [ 24.182253] task: ffffffc0793d8c80 task.stack: ffffffc078c48000
>>> [ 24.188178] PC is at mmc_release_host+0xa0/0xa8
>>> [ 24.192725] LR is at mmc_put_card+0x18/0x3c
>>> [ 24.196917] pc : [<ffffff80086c2550>] lr : [<ffffff80086c31f4>]
>>> pstate: 80000145
>>> [ 24.204317] sp : ffffffc078c4bd20
>>> [ 24.207636] x29: ffffffc078c4bd20 x28: 0000000000000000
>>> [ 24.212975] x27: 0000000000000000 x26: ffffffc077903420
>>> [ 24.216220] x25: ffffffc078788028 x24: ffffffc0787e8800
>>> [ 24.216232] x23: ffffffc078788000 x22: 0000000000000000
>>> [ 24.216243] x21: 0000000000000000 x20: ffffffc078788018
>>> [ 24.216254] x19: ffffffc0787e8800 x18: 0000000000000000
>>> [ 24.216265] x17: 0000000000000000 x16: 0000000000000000
>>> [ 24.216276] x15: 0000000000000000 x14: ffffffc078789430
>>> [ 24.216288] x13: 000000000000002f x12: 000000000000b853
>>> [ 24.216299] x11: ffffffc077903420 x10: 0000000000000860
>>> [ 24.216310] x9 : ffffffc078c48000 x8 : ffffffc0793d9540
>>> [ 24.216322] x7 : 0000000000d3f8c7 x6 : 0000000000002bd0
>>> [ 24.216333] x5 : 00000000021458fa x4 : 00ffffffffffffff
>>> [ 24.216344] x3 : 00000000d0555555 x2 : ffffffc078c4bd5c
>>> [ 24.216355] x1 : 0000000000000000 x0 : 0000000000000000
>>> [ 24.216366]
>>> [ 24.216372] ---[ end trace 74dade4766b71d8d ]---
>>> [ 24.216377] Call trace:
>>> [ 24.216386] Exception stack(0xffffffc078c4bb50 to 0xffffffc078c4bc80)
>>> [ 24.216394] bb40:
>>> ffffffc0787e8800 0000008000000000
>>> [ 24.216403] bb60: ffffffc078c4bd20 ffffff80086c2550
>>> ffffff8008ca6000 ffffffc0784fb200
>>> [ 24.216411] bb80: ffffffc07bf4b7e8 0000000000000008
>>> ffffffc0793d8d00 ffffff8008c82780
>>> [ 24.216420] bba0: ffffffc078c4bbe0 ffffff800843576c
>>> ffffffc078c4bbf0 ffffff800843576c
>>> [ 24.216429] bbc0: ffffffc078c4bcc0 ffffffc078c4bc78
>>> ffffffc078c4bc10 ffffff800843576c
>>> [ 24.216437] bbe0: ffffffc078c4bce0 ffffffc078c4bc98
>>> 0000000000000000 0000000000000000
>>> [ 24.216445] bc00: ffffffc078c4bd5c 00000000d0555555
>>> 00ffffffffffffff 00000000021458fa
>>> [ 24.216452] bc20: 0000000000002bd0 0000000000d3f8c7
>>> ffffffc0793d9540 ffffffc078c48000
>>> [ 24.216460] bc40: 0000000000000860 ffffffc077903420
>>> 000000000000b853 000000000000002f
>>> [ 24.216467] bc60: ffffffc078789430 0000000000000000
>>> 0000000000000000 0000000000000000
>>> [ 24.216479] [<ffffff80086c2550>] mmc_release_host+0xa0/0xa8
>>> [ 24.216486] [<ffffff80086c31f4>] mmc_put_card+0x18/0x3c
>>> [ 24.216497] [<ffffff80086d30e4>] mmc_blk_issue_rq+0x11c/0x4a4
>>> [ 24.216506] [<ffffff80086d3e44>] mmc_queue_thread+0x98/0x158
>>> [ 24.216517] [<ffffff80080cfd7c>] kthread+0xd0/0xe4
>>> [ 24.216527] [<ffffff8008082e90>] ret_from_fork+0x10/0x40
>>
>>
>> Hey Ulf,
>> I *think* I've narrowed this down to
>> 6024e16654c1e1a2475e848d735963d05a12dba9 ("mmc: dw_mmc: set to
>> MMC_CAP_ERASE by default"). Its fairly sporadic so I may be seeing
>> this as a false positive, but after reverting that patch I've
>> seemingly stopped seeing the issue.
>
> Hmm, i don't think so. I *guess* it's not related with commit 6024e16654.

Well, with that patch reverted I've not seen the issue, but with it I
can trigger it fairly quickly.


> Before calling mmc_put_card(), is it issued the discard request?
>
> if ((!req && !(mq->flags & MMC_QUEUE_NEW_REQUEST)) ||
> (cmd_flags & MMC_REQ_SPECIAL_MASK))
>
> Which condition hit?

So I'm seeing those conditions hit quite a lot:

[ 45.056827] JDB: !req: 0 !flags&MMC_QUEUE_NEW_REQUEST: 1
is_special(req): 1
[ 45.066299] JDB: !req: 1 !flags&MMC_QUEUE_NEW_REQUEST: 1
is_special(req): 0
[ 45.135398] JDB: !req: 1 !flags&MMC_QUEUE_NEW_REQUEST: 1
is_special(req): 0
[ 45.143085] JDB: !req: 1 !flags&MMC_QUEUE_NEW_REQUEST: 1
is_special(req): 0
[ 45.152183] JDB: !req: 1 !flags&MMC_QUEUE_NEW_REQUEST: 1
is_special(req): 0
[ 45.161129] JDB: !req: 1 !flags&MMC_QUEUE_NEW_REQUEST: 1
is_special(req): 0


But right before I saw a warning:

[ 51.210393] JDB: !req: 0 !flags&MMC_QUEUE_NEW_REQUEST: 1
is_special(req): 1
[ 51.218085] ------------[ cut here ]------------
[ 51.222800] WARNING: CPU: 0 PID: 1273 at
drivers/mmc/core/core.c:264 mmc_start_request+0x100/0x108
[ 51.231805]
[ 51.233356] CPU: 0 PID: 1273 Comm: mmcqd/0 Not tainted
4.8.0-rc1-00056-gf1f4a8e-dirty #791
[ 51.241667] Hardware name: HiKey Development Board (DT)
[ 51.246940] task: ffffffc078c0b200 task.stack: ffffffc0786b4000
[ 51.252913] PC is at mmc_start_request+0x100/0x108
[ 51.257756] LR is at mmc_start_request+0x18/0x108
[ 51.262508] pc : [<ffffff80086c3090>] lr : [<ffffff80086c2fa8>]
pstate: 00000145
[ 51.269947] sp : ffffffc0786b7bf0
[ 51.273308] x29: ffffffc0786b7bf0 x28: 0000000000000000
[ 51.278743] x27: ffffffc078613018 x26: 0000000000000001
[ 51.284172] x25: 0000000000000000 x24: ffffffc0786b7d5c
[ 51.289602] x23: 0000000000000000 x22: ffffffc078613068
[ 51.295033] x21: ffffffc0786131e8 x20: ffffffc078611800
[ 51.300464] x19: ffffffc078613068 x18: 0000000000000000
[ 51.305891] x17: 0000000000000000 x16: 0000000000000000
[ 51.311320] x15: 00000000000011b1 x14: ffffff8008087f08
[ 51.316749] x13: ffffffc0786b002f x12: ffffff8008087f5c
[ 51.322179] x11: ffffffc0786b7d20 x10: 0000000000000000
[ 51.327607] x9 : 0000000000000000 x8 : 0000000000001000
[ 51.333036] x7 : ffffffc077392180 x6 : 000000001fde3fff
[ 51.338464] x5 : 0000000000000000 x4 : 000000001fde3000
[ 51.343892] x3 : 000000000000003f x2 : 0000000000000020
[ 51.349319] x1 : 0000000000000001 x0 : 0000000000000020
[ 51.354743]
[ 51.356279] ---[ end trace 2c3d9b4e1f4fc435 ]---
[ 51.360940] Call trace:
[ 51.363440] Exception stack(0xffffffc0786b7a20 to 0xffffffc0786b7b50)
[ 51.369942] 7a20: ffffffc078613068 0000008000000000
ffffffc0786b7bf0 ffffff80086c3090
[ 51.377830] 7a40: 0000000000000040 000000000000003f
000000004c7bd000 0000000000000000
[ 51.385719] 7a60: 000000001fde3fff ffffffc077392180
0000000000001000 0000000000000000
[ 51.393609] 7a80: 0000000000000000 ffffffc0786b7d20
ffffff8008087f5c ffffffc0786b002f
[ 51.401496] 7aa0: ffffff8008087f08 00000000000011b1
0000000000000000 0000000000000000
[ 51.409383] 7ac0: 0000000000000020 0000000000000001
0000000000000020 000000000000003f
[ 51.417271] 7ae0: 000000001fde3000 0000000000000000
000000001fde3fff ffffffc077392180
[ 51.425158] 7b00: 0000000000001000 0000000000000000
0000000000000000 ffffffc0786b7d20
[ 51.433046] 7b20: ffffff8008087f5c ffffffc0786b002f
ffffff8008087f08 00000000000011b1
[ 51.440928] 7b40: 0000000000000000 0000000000000000
[ 51.445863] [<ffffff80086c3090>] mmc_start_request+0x100/0x108
[ 51.451754] [<ffffff80086c3fe8>] mmc_start_req+0x280/0x3c8
[ 51.457305] [<ffffff80086d364c>] mmc_blk_issue_rw_rq+0xac/0x9b0
[ 51.463285] [<ffffff80086d3fec>] mmc_blk_issue_rq+0x9c/0x4d8
[ 51.469004] [<ffffff80086d4e00>] mmc_queue_thread+0x98/0x158
[ 51.474725] [<ffffff80080d04dc>] kthread+0xd0/0xe4
[ 51.479577] [<ffffff8008082e90>] ret_from_fork+0x10/0x40
[ 51.488484] ------------[ cut here ]------------

[ 51.774524] ------------[ cut here ]------------
[ 51.779150] WARNING: CPU: 0 PID: 1273 at
drivers/mmc/core/core.c:264 mmc_start_request+0x100/0x108
[ 51.788107]
[ 51.789599] CPU: 0 PID: 1273 Comm: mmcqd/0 Tainted: G W
4.8.0-rc1-00056-gf1f4a8e-dirty #791
[ 51.799078] Hardware name: HiKey Development Board (DT)
[ 51.804302] task: ffffffc078c0b200 task.stack: ffffffc0786b4000
[ 51.810222] PC is at mmc_start_request+0x100/0x108
[ 51.815012] LR is at mmc_start_request+0x18/0x108
[ 51.819715] pc : [<ffffff80086c3090>] lr : [<ffffff80086c2fa8>]
pstate: 40000145
[ 51.827109] sp : ffffffc0786b79b0
[ 51.830422] x29: ffffffc0786b79b0 x28: ffffffc0786b7bc8
[ 51.835749] x27: ffffff8008bcd5f0 x26: ffffff8008a7f430
[ 51.841075] x25: 0000000000000001 x24: ffffffc077a3f200
[ 51.846401] x23: 0000000000000000 x22: 000000010001573e
[ 51.851727] x21: ffffffc0786b7a40 x20: ffffffc078611800
[ 51.857053] x19: ffffffc0786b7a20 x18: 0000000000000000
[ 51.862378] x17: 0000000000000000 x16: 0000000000000000
[ 51.867704] x15: 0000000000000002 x14: 0000000000000000
[ 51.873030] x13: 0000000000000000 x12: 0000000000000002
[ 51.878355] x11: ffffff80089df458 x10: 0000000000000860
[ 51.883681] x9 : ffffffc0786b4000 x8 : ffffffc078c0bac0
[ 51.889006] x7 : ffffffc07bf257a0 x6 : 0000000000000000
[ 51.894333] x5 : 000000000000000d x4 : 0000000040000543
[ 51.899658] x3 : 0000000000000195 x2 : ffffff8008d666d8
[ 51.904983] x1 : 0000000000000002 x0 : 0000000000000000
[ 51.910308]
[ 51.911797] ---[ end trace 2c3d9b4e1f4fc437 ]---
[ 51.916412] Call trace:
[ 51.918858] Exception stack(0xffffffc0786b77e0 to 0xffffffc0786b7910)
[ 51.925300] 77e0: ffffffc0786b7a20 0000008000000000
ffffffc0786b79b0 ffffff80086c3090
[ 51.933131] 7800: ffffffc0786b7850 ffffff80089b8b88
ffffffc07bf25780 0000000000000000
[ 51.940962] 7820: ffffffc078c0b200 ffffffc0387e1900
ffffffc077bdd400 ffffff80089b921c
[ 51.948793] 7840: 0000000000000001 ffffffc07bf1f090
ffffffc0786b78b0 ffffff80089b921c
[ 51.956625] 7860: ffffffc0786b4000 0000008000000000
ffffffc0786b7a00 ffffff80086c37ec
[ 51.964456] 7880: 0000000000000000 0000000000000002
ffffff8008d666d8 0000000000000195
[ 51.972287] 78a0: 0000000040000543 000000000000000d
0000000000000000 ffffffc07bf257a0
[ 51.980118] 78c0: ffffffc078c0bac0 ffffffc0786b4000
0000000000000860 ffffff80089df458
[ 51.987949] 78e0: 0000000000000002 0000000000000000
0000000000000000 0000000000000002
[ 51.995778] 7900: 0000000000000000 0000000000000000
[ 52.000656] [<ffffff80086c3090>] mmc_start_request+0x100/0x108
[ 52.006490] [<ffffff80086c30e8>] __mmc_start_req+0x50/0x7c
[ 52.011976] [<ffffff80086c37cc>] mmc_wait_for_cmd+0x50/0x74
[ 52.017550] [<ffffff80086d2db8>] get_card_status.isra.30+0x50/0x68
[ 52.023731] [<ffffff80086d2e44>] card_busy_detect.isra.31+0x74/0x150
[ 52.030086] [<ffffff80086d2ff4>] mmc_blk_err_check+0xd4/0x544
[ 52.035833] [<ffffff80086c3e44>] mmc_start_req+0xdc/0x3c8
[ 52.041233] [<ffffff80086d364c>] mmc_blk_issue_rw_rq+0xac/0x9b0
[ 52.047153] [<ffffff80086d4188>] mmc_blk_issue_rq+0x238/0x4d8
[ 52.052901] [<ffffff80086d4e00>] mmc_queue_thread+0x98/0x158
[ 52.058562] [<ffffff80080d04dc>] kthread+0xd0/0xe4
[ 52.063355] [<ffffff8008082e90>] ret_from_fork+0x10/0x40

[ 52.068862] JDB: !req: 1 !flags&MMC_QUEUE_NEW_REQUEST: 1
is_special(req): 0
[ 52.076168] ------------[ cut here ]------------
[ 52.080791] WARNING: CPU: 0 PID: 1273 at
drivers/mmc/core/core.c:991 mmc_release_host+0xa0/0xa8
[ 52.089488]
[ 52.090979] CPU: 0 PID: 1273 Comm: mmcqd/0 Tainted: G W
4.8.0-rc1-00056-gf1f4a8e-dirty #791
[ 52.100458] Hardware name: HiKey Development Board (DT)
[ 52.105682] task: ffffffc078c0b200 task.stack: ffffffc0786b4000
[ 52.111601] PC is at mmc_release_host+0xa0/0xa8
[ 52.116131] LR is at mmc_put_card+0x18/0x3c
[ 52.120312] pc : [<ffffff80086c34d8>] lr : [<ffffff80086c417c>]
pstate: 80000145
[ 52.127707] sp : ffffffc0786b7d20
[ 52.131019] x29: ffffffc0786b7d20 x28: 0000000000000000
[ 52.136346] x27: ffffffc078613018 x26: ffffffc077a3f140
[ 52.141672] x25: ffffffc078613028 x24: ffffffc0786b4000
[ 52.146998] x23: ffffffc078611800 x22: ffffffc078613000
[ 52.152323] x21: 0000000000000001 x20: ffffffc078612800
[ 52.157648] x19: ffffffc078611800 x18: 0000000000000000
[ 52.162973] x17: 0000000000000000 x16: 0000000000000000
[ 52.168298] x15: 0000000000001244 x14: 69202031203a5453
[ 52.173623] x13: 45555145525f5745 x12: 4e5f45554555515f
[ 52.178948] x11: 434d4d267367616c x10: 662120202031203a
[ 52.184274] x9 : 71657221203a4244 x8 : 203a29716572286c
[ 52.189598] x7 : 0000000000001245 x6 : ffffff8008f5a90b
[ 52.194924] x5 : 0000000000000000 x4 : 0000000000000000
[ 52.200249] x3 : 0000000000000002 x2 : 0000000000000001
[ 52.205574] x1 : 0000000000000000 x0 : 0000000000000000
[ 52.210899]
[ 52.212387] ---[ end trace 2c3d9b4e1f4fc438 ]---
[ 52.217002] Call trace:
[ 52.219448] Exception stack(0xffffffc0786b7b50 to 0xffffffc0786b7c80)
[ 52.225889] 7b40:
ffffffc078611800 0000008000000000
[ 52.233719] 7b60: ffffffc0786b7d20 ffffff80086c34d8
ffffff8008cc39b8 0000000000000000
[ 52.241551] 7b80: 0000000000000000 0000000000000042
0000000000000140 ffffffc0786b7c90
[ 52.249382] 7ba0: ffffff8008d39000 0000000000000042
0000000000000000 ffffff8008b5a270
[ 52.257214] 7bc0: 0000000100000004 ffffff8008cc39f8
ffffffc0786b7c80 ffffff80080f7e3c
[ 52.265045] 7be0: ffffffc078611cc8 ffffffc078612800
0000000000000000 0000000000000000
[ 52.272876] 7c00: 0000000000000001 0000000000000002
0000000000000000 0000000000000000
[ 52.280707] 7c20: ffffff8008f5a90b 0000000000001245
203a29716572286c 71657221203a4244
[ 52.288538] 7c40: 662120202031203a 434d4d267367616c
4e5f45554555515f 45555145525f5745
[ 52.296369] 7c60: 69202031203a5453 0000000000001244
0000000000000000 0000000000000000
[ 52.304199] [<ffffff80086c34d8>] mmc_release_host+0xa0/0xa8
[ 52.309772] [<ffffff80086c417c>] mmc_put_card+0x18/0x3c
[ 52.315000] [<ffffff80086d4094>] mmc_blk_issue_rq+0x144/0x4d8
[ 52.320748] [<ffffff80086d4e00>] mmc_queue_thread+0x98/0x158
[ 52.326409] [<ffffff80080d04dc>] kthread+0xd0/0xe4
[ 52.331201] [<ffffff8008082e90>] ret_from_fork+0x10/0x40


>> Anyway, I'll do some further testing tomorrow w/ that removed. Usually
>> I see the issue 1-2 times an hour, so if I go the day w/o a problem
>> I'll let you know.
>>
>> Zhangfei/Guodong: Any ideas as to why ERASE might cause trouble on HiKey?
>
> Did you try to send the Erase command directly? e,g fstrim or other things?

Android's vold might be, but I'm not.

> Is it occurred on every booting time?

Unfortunately no. Sometimes it happens shortly after bootup, sometimes
I've seen it after suspend/resume cycle. Sometimes it triggers after a
while leaving the board idle. I've not sorted out the pattern.

thanks
-john