Re: [PATCH] brcmfmac: abort and release host after error

From: Guenter Roeck
Date: Tue Jan 28 2020 - 23:17:46 EST


On 1/28/20 7:32 PM, Dan Carpenter wrote:
> On Tue, Jan 28, 2020 at 02:14:57PM -0800, Guenter Roeck wrote:
>> With commit 216b44000ada ("brcmfmac: Fix use after free in
>> brcmf_sdio_readframes()") applied, we see locking timeouts in
>> brcmf_sdio_watchdog_thread().
>>
>> brcmfmac: brcmf_escan_timeout: timer expired
>> INFO: task brcmf_wdog/mmc1:621 blocked for more than 120 seconds.
>> Not tainted 4.19.94-07984-g24ff99a0f713 #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> brcmf_wdog/mmc1 D 0 621 2 0x00000000 last_sleep: 2440793077. last_runnable: 2440766827
>> [<c0aa1e60>] (__schedule) from [<c0aa2100>] (schedule+0x98/0xc4)
>> [<c0aa2100>] (schedule) from [<c0853830>] (__mmc_claim_host+0x154/0x274)
>> [<c0853830>] (__mmc_claim_host) from [<bf10c5b8>] (brcmf_sdio_watchdog_thread+0x1b0/0x1f8 [brcmfmac])
>> [<bf10c5b8>] (brcmf_sdio_watchdog_thread [brcmfmac]) from [<c02570b8>] (kthread+0x178/0x180)
>>
>> In addition to restarting or exiting the loop, it is also necessary to
>> abort the command and to release the host.
>>
>> Fixes: 216b44000ada ("brcmfmac: Fix use after free in brcmf_sdio_readframes()")
>
> Huh... Thanks for fixing the bug. That seems to indicate that we were
> triggering the use after free but no one noticed at runtime. With

Actually, we did see the problem. We just didn't realize it.

> kfree(), a use after free can be harmless if you don't have poisoning
> enabled and no other thread has re-used the memory. I'm not sure about
> kfree_skb() but presumably it's the same.


Not really; it ultimately does result in a crash. We see that in ChromeOS
R80 (and probably in all earlier releases, but I didn't check), which does
not (yet) include 216b44000ada. The upcoming R81, which does include
216b44000ada, doesn't crash, but there are lots of stalls like the one
above. The combination of both (i.e., the difference in behavior) helped
track down the problem.
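
For illustration only, here is a minimal user-space sketch of the pattern
the commit message describes: an error path that restarts the loop must
also abort the command and release the host, or the next claimant stalls.
This is not the brcmfmac code. struct fake_host, fake_try_claim(),
fake_abort(), fake_release() and read_frames() are all hypothetical
stand-ins, and a failed try-claim models what would be a blocking claim
in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a host that only one context may claim. */
struct fake_host {
	bool claimed;	/* true while somebody holds the host */
	int aborts;	/* number of abort commands issued */
};

/* Returns false where a real blocking claim would sleep forever. */
static bool fake_try_claim(struct fake_host *h)
{
	if (h->claimed)
		return false;
	h->claimed = true;
	return true;
}

static void fake_release(struct fake_host *h)
{
	h->claimed = false;
}

static void fake_abort(struct fake_host *h)
{
	h->aborts++;
}

/*
 * Process nframes frames; the frame with index bad fails (-1: none).
 * Returns the number of frames handled before finishing or stalling.
 */
static int read_frames(struct fake_host *h, int nframes, int bad,
		       bool cleanup_on_error)
{
	int done = 0;

	for (int i = 0; i < nframes; i++) {
		if (!fake_try_claim(h))
			return done;	/* host leaked by an earlier error */

		if (i == bad) {
			if (cleanup_on_error) {
				fake_abort(h);		/* abort the command */
				fake_release(h);	/* and release the host */
			}
			continue;	/* restart the loop */
		}

		done++;
		fake_release(h);
	}
	return done;
}
```

With cleanup_on_error set, the loop skips the bad frame and carries on.
Without it, the host stays claimed after the error and the next claim
attempt cannot succeed, which is the user-space analogue of the hung
watchdog thread in the trace above.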

Guenter