[PATCH v1 00/10] Support for GMU coredump and some related improvements

From: Akhil P Oommen
Date: Wed Mar 02 2022 - 12:28:13 EST


The major enhancement in this series is support for a minimal gmu coredump
that can be captured inline instead of through our usual recover worker. This
is helpful for gmu errors in the gpu wake-up/suspend path, since it lets us
capture a snapshot of the gmu before we suspend. I had to introduce a lock to
synchronize access to the crashstate because runtime suspend can happen from
an asynchronous RPM thread.
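
To illustrate why the crashstate needs a lock, here is a minimal userspace
sketch, not the code from this series. It assumes a pthread mutex as a
stand-in for a kernel mutex, and every name in it (fake_gpu, crashstate_lock,
fake_gpu_capture_inline) is hypothetical. It also assumes a "first capture
wins" policy, which the cover letter does not state.

```c
/* Userspace model only: a pthread mutex stands in for a kernel mutex. */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a captured gmu snapshot. */
struct crashstate {
	char reason[64];
};

/* Hypothetical device state; field names are illustrative. */
struct fake_gpu {
	pthread_mutex_t crashstate_lock; /* guards 'crashstate' */
	struct crashstate *crashstate;   /* NULL until a snapshot is taken */
	struct crashstate storage;       /* backing store, avoids malloc */
};

static void fake_gpu_init(struct fake_gpu *gpu)
{
	pthread_mutex_init(&gpu->crashstate_lock, NULL);
	gpu->crashstate = NULL;
}

/*
 * Capture a snapshot inline, from whichever thread hit the error
 * (for example a resume path or an asynchronous suspend thread).
 * Returns true if this caller stored the snapshot, false if one was
 * already present (assumed policy: first capture wins).
 */
static bool fake_gpu_capture_inline(struct fake_gpu *gpu, const char *reason)
{
	bool captured = false;

	pthread_mutex_lock(&gpu->crashstate_lock);
	if (!gpu->crashstate) {
		strncpy(gpu->storage.reason, reason,
			sizeof(gpu->storage.reason) - 1);
		gpu->storage.reason[sizeof(gpu->storage.reason) - 1] = '\0';
		gpu->crashstate = &gpu->storage;
		captured = true;
	}
	pthread_mutex_unlock(&gpu->crashstate_lock);

	return captured;
}
```

Without the lock, two threads could both see a NULL crashstate and both write
the snapshot. With it, the check and the store happen as one step.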

Apart from this, there are some improvements to handle gmu errors
gracefully, either by propagating the error back to the parent or by retrying.
There are also a few patches that fix trivial bugs in the related code.
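
The "retry, then propagate" idea can be sketched roughly as below. This is an
illustrative userspace example under stated assumptions, not the driver code:
resume_fn, resume_with_retry, flaky_resume and the attempt limit are all
hypothetical names and parameters chosen for the sketch.

```c
#include <errno.h>

/* Hypothetical resume callback: 0 on success, negative errno on failure. */
typedef int (*resume_fn)(void *ctx);

/*
 * Call 'resume' up to 'max_attempts' times. Return 0 on the first
 * success; otherwise return the last error so the caller can act on it
 * instead of the failure being silently dropped.
 */
static int resume_with_retry(resume_fn resume, void *ctx, int max_attempts)
{
	int ret = -EINVAL; /* returned as-is if max_attempts <= 0 */

	for (int attempt = 0; attempt < max_attempts; attempt++) {
		ret = resume(ctx);
		if (ret == 0)
			return 0;
	}

	return ret;
}

/* Example callback: fails with -ETIMEDOUT until *remaining reaches zero. */
static int flaky_resume(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -ETIMEDOUT;
	}

	return 0;
}
```

For instance, with a callback that fails twice, resume_with_retry(flaky_resume,
&fails, 3) succeeds on the third attempt. With one that fails five times, the
same call gives up and hands -ETIMEDOUT back to its caller.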


Akhil P Oommen (10):
  drm/msm/a6xx: Add helper to check smmu is stalled
  drm/msm/a6xx: Send NMI to gmu when it is hung
  drm/msm/a6xx: Avoid gmu lock in pm ops
  drm/msm/a6xx: Enhance debugging of gmu faults
  drm/msm: Do recovery on hw_init failure
  drm/msm/a6xx: Propagate OOB set error
  drm/msm/adreno: Retry on gpu resume failure
  drm/msm/a6xx: Remove clk votes on failure
  drm/msm: Remove pm_runtime_get() from msm_job_run()
  drm/msm/a6xx: Free gmu_debug crashstate bo

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c       | 89 insertions/deletions
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h       |  1
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c       | 31
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h       |  4
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 79
 drivers/gpu/drm/msm/adreno/adreno_device.c  | 10
 drivers/gpu/drm/msm/adreno/adreno_gpu.c     | 10
 drivers/gpu/drm/msm/adreno/adreno_gpu.h     |  2
 drivers/gpu/drm/msm/msm_gpu.c               | 28
 drivers/gpu/drm/msm/msm_gpu.h               | 11
 drivers/gpu/drm/msm/msm_ringbuffer.c        |  4
 11 files changed, 218 insertions(+), 51 deletions(-)

--
2.7.4