Re: [PATCH v3] remoteproc: imx_dsp_rproc: Add support of recovery and coredump process
From: Iuliana Prodan
Date: Mon Jul 28 2025 - 11:09:43 EST
On 7/28/2025 5:14 PM, Mathieu Poirier wrote:
On Mon, Jul 28, 2025 at 01:39:38PM +0300, Daniel Baluta wrote:
On Tue, Jul 22, 2025 at 11:16 AM Shengjiu Wang <shengjiu.wang@xxxxxxx> wrote:
FW recovery can be enabled, but it is broken because the software reset
is missing from the recovery flow. So move the software reset from
imx_dsp_runtime_resume() to .load() and clear memory before loading
firmware to make recovery work.
Add a call to rproc_coredump_set_elf_info() to initialize the ELF info
for coredump; otherwise coredump reports the error "ELF class is not set".
Fixes: ec0e5549f358 ("remoteproc: imx_dsp_rproc: Add remoteproc driver for DSP on i.MX")
Signed-off-by: Shengjiu Wang <shengjiu.wang@xxxxxxx>
Changes look good to me:
I agree, but this is not enough.
Reviewed-by: Daniel Baluta <daniel.baluta@xxxxxxx>
I've tested it with Zephyr synchronization samples, inducing a crash
via the debugfs interface. The app can recover correctly.
The synchronization sample does not use the Messaging Unit (MU) for
communication between the two cores; its behavior is similar to the
basic hello_world example (no fw_ready reply is expected by the host).
I’ve tested this patch with both the synchronization and hello_world
samples, as well as with the default firmware specified in the device
tree (imx/dsp/hifi4.bin), and everything works as expected.
However, when testing with the openamp_rsc_table sample from Zephyr [1],
I encountered the following issue:
```
[ 1500.964232] remoteproc remoteproc0: crash detected in imx-dsp-rproc: type watchdog
[ 1500.964595] remoteproc remoteproc0: handling crash #1 in imx-dsp-rproc
[ 1500.964608] remoteproc remoteproc0: recovering imx-dsp-rproc
[ 1500.965959] remoteproc remoteproc0: stopped remote processor imx-dsp-rproc
[ 1501.251897] remoteproc remoteproc0: can't start rproc imx-dsp-rproc: -110
```
Upon debugging, I discovered that the issue stems from the imx-mailbox
driver not clearing the General Purpose Interrupt (GPI) bits. This leads
to the remote processor failing to restart properly.
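The masking idea behind the fix can be sketched in plain C outside the kernel. This is only an illustration under stated assumptions: the helper names (gie_bit, gip_bit, stale_gip_mask) are hypothetical, and the bit layout (GIEn and GIPn sharing bits 31..28 of the control and status registers) is assumed for the example rather than taken from a reference manual. The attached patch computes a mask of this kind and writes it back to the status register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of general-purpose interrupts, as in the attached patch. */
#define GIP_COUNT 4u

/* Hypothetical layout: enable bit for GPI n in the control register. */
static uint32_t gie_bit(unsigned int n)
{
	assert(n < GIP_COUNT);
	return UINT32_C(1) << (28u + (3u - n));
}

/* Hypothetical layout: pending bit for GPI n in the status register. */
static uint32_t gip_bit(unsigned int n)
{
	assert(n < GIP_COUNT);
	return UINT32_C(1) << (28u + (3u - n));
}

/*
 * Return the pending bits of every GPI that is pending in `status`
 * while its enable bit in `ctrl` is cleared. These are the stale bits
 * that would otherwise survive across a restart of the remote core.
 */
static uint32_t stale_gip_mask(uint32_t ctrl, uint32_t status)
{
	uint32_t mask = 0;

	for (unsigned int n = 0; n < GIP_COUNT; n++) {
		bool pending = (status & gip_bit(n)) != 0;
		bool enabled = (ctrl & gie_bit(n)) != 0;

		if (pending && !enabled)
			mask |= gip_bit(n);
	}

	return mask;
}
```

Checking each interrupt separately avoids relying on GIEn and GIPn occupying the same bit positions, which the bitwise AND of the two masks in the patch depends on.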
To ensure compatibility across all firmware variants, including those
using OpenAMP, the attached patch is required. Both the recovery and
mailbox patches have been successfully tested on the following
platforms: i.MX8MP, i.MX8ULP, i.MX8QM and i.MX8QXP.
Shengjiu, do you want to send a new version with both patches?
Thanks,
Iulia
Very good - I will merge this around 6.17-rc2 when I get back from vacation.
Mathieu
[1] https://github.com/zephyrproject-rtos/zephyr/tree/main/samples/subsys/ipc/openamp_rsc_table
From 47786070f1ffbd73f4ff0009e2dbddc79d607e86 Mon Sep 17 00:00:00 2001
From: Iuliana Prodan <iuliana.prodan@xxxxxxx>
Date: Mon, 28 Jul 2025 15:21:24 +0300
Subject: [PATCH 4/4] mailbox: imx: Clear pending bits for the GPIs that are
not enabled
Enhance the i.MX Messaging Unit interrupt service routine
to properly handle general-purpose interrupts (GIP) that
are pending but have their corresponding enable bits (GIEn)
cleared.
This ensures that we can notify the host - such as sending
a fw_ready reply from the DSP remote core - on the second
or any subsequent startup.
Signed-off-by: Iuliana Prodan <iuliana.prodan@xxxxxxx>
---
drivers/mailbox/imx-mailbox.c | 31 +++++++++++++++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)
diff --git a/drivers/mailbox/imx-mailbox.c b/drivers/mailbox/imx-mailbox.c
index 6b9dbd6a337a..2d1d81545673 100644
--- a/drivers/mailbox/imx-mailbox.c
+++ b/drivers/mailbox/imx-mailbox.c
@@ -40,6 +40,9 @@
#define IMX_MU_SECO_TX_TOUT (msecs_to_jiffies(3000))
#define IMX_MU_SECO_RX_TOUT (msecs_to_jiffies(3000))
+/* 4 general-purpose interrupt requests reflected to the other side */
+#define IMX_MU_GIP_NO 4
+
/* Please not change TX & RX */
enum imx_mu_chan_type {
IMX_MU_TYPE_TX = 0, /* Tx */
@@ -143,7 +146,7 @@ struct imx_mu_dcfg {
/* MU reset */
#define IMX_MU_xCR_RST(type) (type & IMX_MU_V2 ? BIT(0) : BIT(5))
#define IMX_MU_xSR_RST(type) (type & IMX_MU_V2 ? BIT(0) : BIT(7))
-
+#define IMX_MU_xSR_BRDIP(type) (type & IMX_MU_V2 ? BIT(0) : BIT(9))
static struct imx_mu_priv *to_imx_mu_priv(struct mbox_controller *mbox)
{
@@ -530,7 +533,31 @@ static irqreturn_t imx_mu_isr(int irq, void *p)
struct mbox_chan *chan = p;
struct imx_mu_priv *priv = to_imx_mu_priv(chan->mbox);
struct imx_mu_con_priv *cp = chan->con_priv;
- u32 val, ctrl;
+ u32 i, val, ctrl;
+ u32 gips = 0, gies = 0;
+ u32 mu_cr = imx_mu_read(priv, priv->dcfg->xCR[IMX_MU_GCR]);
+ u32 mu_sr = imx_mu_read(priv, priv->dcfg->xSR[IMX_MU_GSR]);
+ u32 brdip = IMX_MU_xSR_BRDIP(priv->dcfg->type);
+
+ for (i = 0; i < IMX_MU_GIP_NO; i++) {
+ gips |= IMX_MU_xSR_GIPn(priv->dcfg->type, i);
+ gies |= IMX_MU_xCR_GIEn(priv->dcfg->type, i);
+ }
+ /* Keep only GIEn bits that are disabled */
+ gies &= (~mu_cr);
+ /* Keep only GIPn bits that are pending */
+ gips &= mu_sr;
+ /* Keep only GIPn bits that have the corresponding GIEn bits disabled */
+ gips &= gies;
+
+ /*
+ * Clear the BRDIP bit, processor B-side is out of reset,
+ * which also clears general purpose interrupt 3
+ */
+ if (mu_sr & brdip)
+ gips |= brdip;
+ /* Clear pending bits for the general purpose interrupts that are not enabled */
+ imx_mu_write(priv, gips, priv->dcfg->xSR[IMX_MU_GSR]);
switch (cp->type) {
case IMX_MU_TYPE_TX:
--
2.25.1