Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

From: Andy Shevchenko
Date: Wed May 20 2020 - 04:32:50 EST


On Wed, May 20, 2020 at 8:40 AM Emmanuel Grumbach <egrumbach@xxxxxxxxx> wrote:

> Since I have been involved quite a bit in the firmware debugging
> features in iwlwifi, I think I can give a few insights here.
>
> But before this, we need to understand that there are several sources of issues:
> 1) the firmware may crash but the bus is still alive, you can still
> use the bus to get the crash data
> 2) the bus is dead, when that happens, the firmware might even be in a
> good condition, but since the bus is dead, you stop getting any
> information about the firmware, and then, at some point, you get to
> the conclusion that the firmware is dead. You can't get the crash data
> that resides on the other side of the bus (you may have gathered data
> in the DRAM directly, but that's a different thing), and you don't
> have much recovery to do besides re-starting the PCI enumeration.
>
> At Intel, we have seen both unfortunately. The bus issues are the ones
> that are trickier obviously. Trickier to detect (because you just get
> garbage from any request you issue on the bus), and trickier to
> handle. One can argue that the kernel should *not* handle those and
> let this in userspace hands. I guess it all depends on what component
> you ship to your customer and what you customer asks from you :).

Or the two best approaches:
1) get rid of firmware completely;
2) make it OSS (like SOF).

I think any of these is a right thing to do in long-term perspective.

How many firmwares average computer has? 50? 100? Any of them is a
burden and PITA.

--
With Best Regards,
Andy Shevchenko