Re: [PATCH 2/2] ARM: OMAP4460: cpuidle: WA for ROM bug because ofCA9 r2pX gic control register change

From: Grygorii Strashko
Date: Thu Oct 17 2013 - 10:49:25 EST


On 10/17/2013 04:57 PM, Santosh Shilimkar wrote:
On Thursday 17 October 2013 05:24 AM, Grygorii Strashko wrote:
On OMAP4+ devices, GIC register context is lost when MPUSS hits the
OSWR. On the CPU wakeup path, ROM code gets executed and one of the
steps in it is to restore the saved context of the GIC.

The ROM code uses GICD != 1 condition to decide how the GIC registers
are handled in wakeup path from OSWR. But, GICD register has changed
between CortexA9 r1pX and r2pX and it contains 2 bits now. Secure view
which ROM code sees:
bit 1 == Enable Non-secure
bit 0 == Enable secure
Non-secure view which HLOS sees:
bit 0 == Enable Non-secure

As result, on OMAP4460(r2pX) devices, when the ROM code is executed
during CPU1 wakeup, GICD == 3 and it fails to understand the real wakeup
power state and reconfigures GIC distributor to boot values and, as
result, the entire interrupt controller context will loose in a live
system.

Hence, implement a workaround on OMAP4460 devices in case if MPUSS has
hit OSWR - as long as CPU1 sees GICD == 1 in it's wakeup path from OSWR,
the issue won't happen:
1.1) CPU0 must disable the GIC distributor, before doing the CPU1
wakeup,
1.2) CPU0 should wait until CPU1 will re-enable the GIC distributor
2) CPU1 must re-enable the GIC distributor on it's wakeup path.

The workaround for CPUIdle has been implemented in the same way as
for boot-up & hot-plug path in:
- http://git.kernel.org/?p=linux/kernel/git/tmlind/linux-omap.git;
a=commitdiff;h=ff999b8a0983ee15668394ed49e38d3568fc6859

For more information see:
- https://patchwork.kernel.org/patch/1609011/
- http://www.spinics.net/lists/arm-kernel/msg201402.html

The ROM code bug is applicable to only OMAP4460(r2pX) devices.
OMAP4470 (also r2pX) is not affected by this bug because ROM code has been
fixed.

Just give reference to the commit which has best description about
the bug and just say applying the fix for idle case.

ff999b8a {ARM: OMAP4460: Workaround for ROM...}

Cc: Santosh Shilimkar <santosh.shilimkar@xxxxxx>
Cc: Kevin Hilman <khilman@xxxxxxxxxx>
Reported-and-Tested-by: Taras Kondratiuk <taras.kondratiuk@xxxxxxxxxx>
Signed-off-by: Grygorii Strashko <grygorii.strashko@xxxxxx>
---
arch/arm/mach-omap2/common.h | 1 +
arch/arm/mach-omap2/cpuidle44xx.c | 34 +++++++++++++++++++++++++++++++++-
arch/arm/mach-omap2/omap4-common.c | 6 ++++++
3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mach-omap2/common.h b/arch/arm/mach-omap2/common.h
index b875a4a..7957110 100644
--- a/arch/arm/mach-omap2/common.h
+++ b/arch/arm/mach-omap2/common.h
@@ -232,6 +232,7 @@ static inline void __iomem *omap4_get_scu_base(void)

extern void __init gic_init_irq(void);
extern void gic_dist_disable(void);
+extern void gic_dist_enable(void);
extern bool gic_dist_disabled(void);
extern void gic_timer_retrigger(void);
extern void omap_smc1(u32 fn, u32 arg);
diff --git a/arch/arm/mach-omap2/cpuidle44xx.c b/arch/arm/mach-omap2/cpuidle44xx.c
index 384aa1c..528638b 100644
--- a/arch/arm/mach-omap2/cpuidle44xx.c
+++ b/arch/arm/mach-omap2/cpuidle44xx.c
@@ -80,6 +80,7 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
int index)
{
struct idle_statedata *cx = state_ptr + index;
+ u32 mpuss_context_lost = 0;

/*
* CPU0 has to wait and stay ON until CPU1 is OFF state.
@@ -126,13 +127,44 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
omap4_enter_lowpower(dev->cpu, cx->cpu_state);
cpu_done[dev->cpu] = true;

+ mpuss_context_lost = omap4_mpuss_read_prev_context_state();
+
Just use the targeted state since couple idle almost grantees
the low power entry. Even in failed case, applying errata sequence
is harmless.

/* Wakeup CPU1 only if it is not offlined */
if (dev->cpu == 0 && cpumask_test_cpu(1, cpu_online_mask)) {
+ /*
+ * GIC distributor control register has changed between
+ * CortexA9 r1pX and r2pX. The Control Register secure
+ * banked version is now composed of 2 bits:
+ * bit 0 == Secure Enable
+ * bit 1 == Non-Secure Enable
+ * The Non-Secure banked register has not changed
+ * Because the ROM Code is based on the r1pX GIC, the CPU1
+ * GIC restoration will cause a problem to CPU0 Non-Secure SW.
+ * The workaround must be:
+ * 1) Before doing the CPU1 wakeup, CPU0 must disable
+ * the GIC distributor and wait until it will be enabled by CPU1
+ * 2) CPU1 must re-enable the GIC distributor on
+ * it's wakeup path.
+ */
+ if (IS_PM44XX_ERRATUM(PM_OMAP4_ROM_SMP_BOOT_ERRATUM_GICD) &&
+ mpuss_context_lost)
Use target state here..

+ gic_dist_disable();
+
clkdm_wakeup(cpu_clkdm[1]);
omap_set_pwrdm_state(cpu_pd[1], PWRDM_POWER_ON);
clkdm_allow_idle(cpu_clkdm[1]);
+
+ if (IS_PM44XX_ERRATUM(PM_OMAP4_ROM_SMP_BOOT_ERRATUM_GICD) &&
+ mpuss_context_lost)
+ while (gic_dist_disabled()) {
+ udelay(1);
+ cpu_relax();
+ }
Am surprised you didn't live lock here. CPUs can wait here infinitely

yes. Thats why i did revert in patch 1 to guarantee that this loop will be executed at right time.

because the distributor isn't turned on in idle case. Suspend case,
CPU1 on its boot-up will enable distributor and hence everything works
with above check.


}

+ if (IS_PM44XX_ERRATUM(PM_OMAP4_ROM_SMP_BOOT_ERRATUM_GICD) && dev->cpu)
+ gic_dist_enable();
+
Just move this code under omap-mpuss-lowpower.c and then things should work
properly. It won't affect suspend code since suspend path is different.


ok. I will do. Thanks for your review.

Thanks for the fix though...

Regards,
Santosh


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/