Re: [Patch V2] memory: tegra: Add MC error logging on tegra186 onward

From: Dmitry Osipenko
Date: Thu Jan 20 2022 - 07:54:27 EST


20.01.2022 12:02, Ashish Mhetre пишет:
> Remove static from tegra30_mc_handle_irq and use it as interrupt handler
> for MC interrupts on tegra186, tegra194 and tegra234 to log the errors.
> Add error specific MC status and address register bits and use them on
> tegra186, tegra194 and tegra234.
> Add error logging for generalized carveout interrupt on tegra186, tegra194
> and tegra234.
> Add error logging for route sanity interrupt on tegra194 an tegra234.
> Add register for higher bits of error address and use it on tegra194 and
> tegra234.
>
> Signed-off-by: Ashish Mhetre <amhetre@xxxxxxxxxx>
> ---
> Changes in v2:
> - Updated patch subject and commit message
> - Removed separate irq handlers
> - Updated tegra30_mc_handle_irq to be used for Tegra186 onwards as well
>
> drivers/memory/tegra/mc.c | 73 ++++++++++++++++++++++++++++++++++-------
> drivers/memory/tegra/mc.h | 16 +++++++++
> drivers/memory/tegra/tegra186.c | 7 ++++
> drivers/memory/tegra/tegra194.c | 5 +++
> drivers/memory/tegra/tegra234.c | 5 +++
> 5 files changed, 94 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> index bf3abb6..badebe8 100644
> --- a/drivers/memory/tegra/mc.c
> +++ b/drivers/memory/tegra/mc.c
> @@ -508,7 +508,21 @@ int tegra30_mc_probe(struct tegra_mc *mc)
> return 0;
> }
>
> -static irqreturn_t tegra30_mc_handle_irq(int irq, void *data)
> +const struct tegra_mc_ops tegra30_mc_ops = {
> + .probe = tegra30_mc_probe,
> + .handle_irq = tegra30_mc_handle_irq,
> +};
> +#endif
> +
> +#if defined(CONFIG_ARCH_TEGRA_3x_SOC) || \
> + defined(CONFIG_ARCH_TEGRA_114_SOC) || \
> + defined(CONFIG_ARCH_TEGRA_124_SOC) || \
> + defined(CONFIG_ARCH_TEGRA_132_SOC) || \
> + defined(CONFIG_ARCH_TEGRA_210_SOC) || \
> + defined(CONFIG_ARCH_TEGRA_186_SOC) || \
> + defined(CONFIG_ARCH_TEGRA_194_SOC) || \
> + defined(CONFIG_ARCH_TEGRA_234_SOC)

Ifdefs are unnecessary, please remove them. They are okay for
tegra30_mc_ops, which is known to be used only by specific older SoC
versions, not okay in case of newer SoCs.

> +irqreturn_t tegra30_mc_handle_irq(int irq, void *data)
> {
> struct tegra_mc *mc = data;
> unsigned long status;
> @@ -521,23 +535,64 @@ static irqreturn_t tegra30_mc_handle_irq(int irq, void *data)
>
> for_each_set_bit(bit, &status, 32) {
> const char *error = tegra_mc_status_names[bit] ?: "unknown";
> + u32 status_reg = MC_ERR_STATUS, addr_reg = MC_ERR_ADR;
> const char *client = "unknown", *desc;
> const char *direction, *secure;
> phys_addr_t addr = 0;
> + u32 addr_hi_reg = 0;
> unsigned int i;
> char perm[7];
> u8 id, type;
> u32 value;
>
> - value = mc_readl(mc, MC_ERR_STATUS);
> +#if defined(CONFIG_ARCH_TEGRA_186_SOC) || \
> + defined(CONFIG_ARCH_TEGRA_194_SOC) || \
> + defined(CONFIG_ARCH_TEGRA_234_SOC)

Please drop these ifdefs.

> + switch (bit) {
> + case MC_INT_DECERR_VPR:
> + status_reg = MC_ERR_VPR_STATUS;
> + addr_reg = MC_ERR_VPR_ADR;

I mentioned previously that VPR is supported by T124+. Hence ifdefs are
incorrect.

...
> + addr |= mc_readl(mc, addr_reg);
>
> if (value & MC_ERR_STATUS_RW)
> direction = "write";
> @@ -591,9 +646,6 @@ static irqreturn_t tegra30_mc_handle_irq(int irq, void *data)
> break;
> }
>
> - value = mc_readl(mc, MC_ERR_ADR);

Don't change the order of the code, just replace the MC_ERR_ADR here.

> - addr |= value;
> -
> dev_err_ratelimited(mc->dev, "%s: %s%s @%pa: %s (%s%s)\n",
> client, secure, direction, &addr, error,
> desc, perm);
> @@ -604,11 +656,6 @@ static irqreturn_t tegra30_mc_handle_irq(int irq, void *data)
>
> return IRQ_HANDLED;
> }
> -
> -const struct tegra_mc_ops tegra30_mc_ops = {
> - .probe = tegra30_mc_probe,
> - .handle_irq = tegra30_mc_handle_irq,
> -};
> #endif
>
> const char *const tegra_mc_status_names[32] = {
> @@ -622,6 +669,8 @@ const char *const tegra_mc_status_names[32] = {
> [12] = "VPR violation",
> [13] = "Secure carveout violation",
> [16] = "MTS carveout violation",
> + [17] = "Generalized carveout violation",
> + [20] = "Route Sanity error",
> };
>
> const char *const tegra_mc_error_names[8] = {
> diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
> index 062886e..9b1b0dc 100644
> --- a/drivers/memory/tegra/mc.h
> +++ b/drivers/memory/tegra/mc.h
> @@ -44,6 +44,8 @@
> #define MC_TIMING_CONTROL_DBG 0xf8
> #define MC_TIMING_CONTROL 0xfc
>
> +#define MC_INT_DECERR_ROUTE_SANITY BIT(20)
> +#define MC_INT_DECERR_GENERALIZED_CARVEOUT BIT(17)
> #define MC_INT_DECERR_MTS BIT(16)
> #define MC_INT_SECERR_SEC BIT(13)
> #define MC_INT_DECERR_VPR BIT(12)
> @@ -65,6 +67,18 @@
> #define MC_ERR_STATUS_SECURITY BIT(17)
> #define MC_ERR_STATUS_RW BIT(16)
>
> +#define MC_ERR_VPR_STATUS 0x654
> +#define MC_ERR_VPR_ADR 0x658
> +#define MC_ERR_SEC_STATUS 0x67c
> +#define MC_ERR_SEC_ADR 0x680
> +#define MC_ERR_MTS_STATUS 0x9b0
> +#define MC_ERR_MTS_ADR 0x9b4
> +#define MC_ERR_ROUTE_SANITY_STATUS 0x9c0
> +#define MC_ERR_ROUTE_SANITY_ADR 0x9c4
> +#define MC_ERR_GENERALIZED_CARVEOUT_STATUS 0xc00
> +#define MC_ERR_GENERALIZED_CARVEOUT_ADR 0xc04
> +#define MC_ERR_ADR_HI 0x11fc

Please put these regs right after the MC_TIMING_CONTROL. There is no
reason to separate them.