Re: mainline/master bisection: baseline.login on meson-sm1-khadas-vim3l

From: Marc Zyngier
Date: Tue Feb 23 2021 - 09:20:07 EST


Hi Guillaume,

On Tue, 23 Feb 2021 09:46:30 +0000,
Guillaume Tucker <guillaume.tucker@xxxxxxxxxxxxx> wrote:
>
> Hello Marc,
>
> Please see the bisection report below about a boot failure on
> meson-sm1-khadas-vim3l on mainline. It seems to affect only
> kernels built with CONFIG_ARM64_64K_PAGES=y.
>
> Reports aren't automatically sent to the public while we're
> trialing new bisection features on kernelci.org, but this one
> looks valid.
>
> There's no output in the log, so the kernel is most likely
> crashing early. Some more details can be found here:
>
> https://kernelci.org/test/case/id/6034bed3b344e2860daddcc8/
>
> Please let us know if you need any help to debug the issue or try
> a fix on this platform.

Thanks for the heads up.

There is actually a fundamental problem with the patch you bisected
to: it provides no guarantee that the point where we enable the EL2
MMU is in the idmap, and, as it turns out, the code we're running
from disappears from under our feet, leading to a translation fault
we're not prepared to handle.
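
To spell the constraint out (a minimal sketch only, with a
hypothetical example_mmu_on label -- the real fix is the patch
below): whatever sequence sets the M bit has to live in the idmap,
because once that write takes effect the PC, still a physical
address, must also be a valid virtual address.

	.pushsection	.idmap.text, "ax"	// the one region where VA == PA
SYM_CODE_START_LOCAL(example_mmu_on)		// hypothetical label
	tlbi	vmalle1				// drop stale translations first
	dsb	nsh
	msr	sctlr_el1, x0			// set the M bit...
	isb					// ...and once this takes effect,
	ret					// the PC must be a valid VA
SYM_CODE_END(example_mmu_on)
	.popsection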

How does it work with 4kB pages? Luck.

Do you mind giving the patch below a go? It does work on my vim3l and
on an FVP, so odds are that it will solve it for you too.

Thanks,

M.

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 678cd2c618ee..fbd2543b8f7d 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -96,8 +96,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe)
cmp x1, xzr
and x2, x2, x1
csinv x2, x2, xzr, ne
- cbz x2, 1f
+ cbnz x2, 2f

+1: eret
+2:
// Engage the VHE magic!
mov_q x0, HCR_HOST_VHE_FLAGS
msr hcr_el2, x0
@@ -131,11 +133,29 @@ SYM_CODE_START_LOCAL(mutate_to_vhe)
msr mair_el1, x0
isb

+ // Hack the exception return to stay at EL2
+ mrs x0, spsr_el1
+ and x0, x0, #~PSR_MODE_MASK
+ mov x1, #PSR_MODE_EL2h
+ orr x0, x0, x1
+ msr spsr_el1, x0
+
+ b enter_vhe
+SYM_CODE_END(mutate_to_vhe)
+
+ // At the point where we reach enter_vhe(), we run with
+ // the MMU off (which is enforced by mutate_to_vhe()).
+ // We thus need to be in the idmap, or everything will
+ // explode when enabling the MMU.
+
+ .pushsection .idmap.text, "ax"
+
+SYM_CODE_START_LOCAL(enter_vhe)
+ // Enable the EL2 S1 MMU, as set up from EL1
// Invalidate TLBs before enabling the MMU
tlbi vmalle1
dsb nsh

- // Enable the EL2 S1 MMU, as set up from EL1
mrs_s x0, SYS_SCTLR_EL12
set_sctlr_el1 x0

@@ -143,17 +163,12 @@ SYM_CODE_START_LOCAL(mutate_to_vhe)
mov_q x0, INIT_SCTLR_EL1_MMU_OFF
msr_s SYS_SCTLR_EL12, x0

- // Hack the exception return to stay at EL2
- mrs x0, spsr_el1
- and x0, x0, #~PSR_MODE_MASK
- mov x1, #PSR_MODE_EL2h
- orr x0, x0, x1
- msr spsr_el1, x0
-
mov x0, xzr

-1: eret
-SYM_CODE_END(mutate_to_vhe)
+ eret
+SYM_CODE_END(enter_vhe)
+
+ .popsection

.macro invalid_vector label
SYM_CODE_START_LOCAL(\label)

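In short, the flow after the patch (a sketch condensed from the diff
above, not the literal code):

	mutate_to_vhe:			// EL2, MMU off: VA == PA anyway
		...			// program the EL2 state
		msr	spsr_el1, x0	// hack the return mode to EL2h
		b	enter_vhe	// and branch into the idmap

	enter_vhe:			// .idmap.text, guaranteed VA == PA
		tlbi	vmalle1		// no stale TLB entries
		dsb	nsh
		set_sctlr_el1	x0	// EL2 S1 MMU on, PC still valid
		eret			// back to the caller, now at EL2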

--
Without deviation from the norm, progress is not possible.