Re: [Patch V2] memory: tegra: Add MC error logging on tegra186 onward

From: Ashish Mhetre
Date: Fri Jan 21 2022 - 00:50:34 EST




On 1/20/2022 6:23 PM, Dmitry Osipenko wrote:
External email: Use caution opening links or attachments


20.01.2022 12:02, Ashish Mhetre пишет:
Remove static from tegra30_mc_handle_irq and use it as interrupt handler
for MC interrupts on tegra186, tegra194 and tegra234 to log the errors.
Add error specific MC status and address register bits and use them on
tegra186, tegra194 and tegra234.
Add error logging for generalized carveout interrupt on tegra186, tegra194
and tegra234.
Add error logging for route sanity interrupt on tegra194 an tegra234.
Add register for higher bits of error address and use it on tegra194 and
tegra234.

Signed-off-by: Ashish Mhetre <amhetre@xxxxxxxxxx>
---
Changes in v2:
- Updated patch subject and commit message
- Removed separate irq handlers
- Updated tegra30_mc_handle_irq to be used for Tegra186 onwards as well

drivers/memory/tegra/mc.c | 73 ++++++++++++++++++++++++++++++++++-------
drivers/memory/tegra/mc.h | 16 +++++++++
drivers/memory/tegra/tegra186.c | 7 ++++
drivers/memory/tegra/tegra194.c | 5 +++
drivers/memory/tegra/tegra234.c | 5 +++
5 files changed, 94 insertions(+), 12 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index bf3abb6..badebe8 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -508,7 +508,21 @@ int tegra30_mc_probe(struct tegra_mc *mc)
return 0;
}

-static irqreturn_t tegra30_mc_handle_irq(int irq, void *data)
+const struct tegra_mc_ops tegra30_mc_ops = {
+ .probe = tegra30_mc_probe,
+ .handle_irq = tegra30_mc_handle_irq,
+};
+#endif
+
+#if defined(CONFIG_ARCH_TEGRA_3x_SOC) || \
+ defined(CONFIG_ARCH_TEGRA_114_SOC) || \
+ defined(CONFIG_ARCH_TEGRA_124_SOC) || \
+ defined(CONFIG_ARCH_TEGRA_132_SOC) || \
+ defined(CONFIG_ARCH_TEGRA_210_SOC) || \
+ defined(CONFIG_ARCH_TEGRA_186_SOC) || \
+ defined(CONFIG_ARCH_TEGRA_194_SOC) || \
+ defined(CONFIG_ARCH_TEGRA_234_SOC)

Ifdefs are unnecessary, please remove them. They are okay for
tegra30_mc_ops, which is known to be used only by specific older SoC
versions, not okay in case of newer SoCs.

Okay, I'll remove these ifdefs in next version.

+irqreturn_t tegra30_mc_handle_irq(int irq, void *data)
{
struct tegra_mc *mc = data;
unsigned long status;
@@ -521,23 +535,64 @@ static irqreturn_t tegra30_mc_handle_irq(int irq, void *data)

for_each_set_bit(bit, &status, 32) {
const char *error = tegra_mc_status_names[bit] ?: "unknown";
+ u32 status_reg = MC_ERR_STATUS, addr_reg = MC_ERR_ADR;
const char *client = "unknown", *desc;
const char *direction, *secure;
phys_addr_t addr = 0;
+ u32 addr_hi_reg = 0;
unsigned int i;
char perm[7];
u8 id, type;
u32 value;

- value = mc_readl(mc, MC_ERR_STATUS);
+#if defined(CONFIG_ARCH_TEGRA_186_SOC) || \
+ defined(CONFIG_ARCH_TEGRA_194_SOC) || \
+ defined(CONFIG_ARCH_TEGRA_234_SOC)

Please drop these ifdefs.

Yes, will do in next version.
+ switch (bit) {
+ case MC_INT_DECERR_VPR:
+ status_reg = MC_ERR_VPR_STATUS;
+ addr_reg = MC_ERR_VPR_ADR;

I mentioned previously that VPR is supported by T124+. Hence ifdefs are
incorrect.

Yes, I will remove them.

...
+ addr |= mc_readl(mc, addr_reg);

if (value & MC_ERR_STATUS_RW)
direction = "write";
@@ -591,9 +646,6 @@ static irqreturn_t tegra30_mc_handle_irq(int irq, void *data)
break;
}

- value = mc_readl(mc, MC_ERR_ADR);

Don't change the order of the code, just replace the MC_ERR_ADR here.

Okay, I'll keep the order as it is. I just thought that higher bits of
address are read above, so why not read lower bits after that.

- addr |= value;
-
dev_err_ratelimited(mc->dev, "%s: %s%s @%pa: %s (%s%s)\n",
client, secure, direction, &addr, error,
desc, perm);
@@ -604,11 +656,6 @@ static irqreturn_t tegra30_mc_handle_irq(int irq, void *data)

return IRQ_HANDLED;
}
-
-const struct tegra_mc_ops tegra30_mc_ops = {
- .probe = tegra30_mc_probe,
- .handle_irq = tegra30_mc_handle_irq,
-};
#endif

const char *const tegra_mc_status_names[32] = {
@@ -622,6 +669,8 @@ const char *const tegra_mc_status_names[32] = {
[12] = "VPR violation",
[13] = "Secure carveout violation",
[16] = "MTS carveout violation",
+ [17] = "Generalized carveout violation",
+ [20] = "Route Sanity error",
};

const char *const tegra_mc_error_names[8] = {
diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
index 062886e..9b1b0dc 100644
--- a/drivers/memory/tegra/mc.h
+++ b/drivers/memory/tegra/mc.h
@@ -44,6 +44,8 @@
#define MC_TIMING_CONTROL_DBG 0xf8
#define MC_TIMING_CONTROL 0xfc

+#define MC_INT_DECERR_ROUTE_SANITY BIT(20)
+#define MC_INT_DECERR_GENERALIZED_CARVEOUT BIT(17)
#define MC_INT_DECERR_MTS BIT(16)
#define MC_INT_SECERR_SEC BIT(13)
#define MC_INT_DECERR_VPR BIT(12)
@@ -65,6 +67,18 @@
#define MC_ERR_STATUS_SECURITY BIT(17)
#define MC_ERR_STATUS_RW BIT(16)

+#define MC_ERR_VPR_STATUS 0x654
+#define MC_ERR_VPR_ADR 0x658
+#define MC_ERR_SEC_STATUS 0x67c
+#define MC_ERR_SEC_ADR 0x680
+#define MC_ERR_MTS_STATUS 0x9b0
+#define MC_ERR_MTS_ADR 0x9b4
+#define MC_ERR_ROUTE_SANITY_STATUS 0x9c0
+#define MC_ERR_ROUTE_SANITY_ADR 0x9c4
+#define MC_ERR_GENERALIZED_CARVEOUT_STATUS 0xc00
+#define MC_ERR_GENERALIZED_CARVEOUT_ADR 0xc04
+#define MC_ERR_ADR_HI 0x11fc

Please put these regs right after the MC_TIMING_CONTROL. There is no
reason to separate them.

Sure, I'll update it in next version.