[PATCH 3/4] x86/boot/compressed/64: Use page table in trampoline memory

From: Kirill A. Shutemov
Date: Mon Mar 12 2018 - 06:03:19 EST


If a bootloader enables 64-bit mode with 4-level paging, we might need to
switch over to 5-level paging. The switching requires the disabling
paging. It works fine if kernel itself is loaded below 4G.

But if the bootloader put the kernel above 4G (i.e. in kexec() case),
we would lose control as soon as paging is disabled, because the code
becomes unreachable to the CPU.

To handle the situation, we need a trampoline in lower memory that would
take care of switching on 5-level paging.

Apart from the trampoline code itself we also need a place to store
top-level page table in lower memory as we don't have a way to load
64-bit values into CR3 in 32-bit mode. We only really need 8 bytes there
as we only use the very first entry of the page table. But we allocate a
whole page anyway.

This patch switches 32-bit code to use page table in trampoline memory.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
---
arch/x86/boot/compressed/head_64.S | 47 +++++++++++++++++++-------------------
1 file changed, 23 insertions(+), 24 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 0014459d9bcb..836ed319e995 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -336,23 +336,6 @@ ENTRY(startup_64)
/* Save the trampoline address in RCX */
movq %rax, %rcx

- /* Check if we need to enable 5-level paging */
- cmpq $0, %rdx
- jz lvl5
-
- /* Clear additional page table */
- leaq lvl5_pgtable(%rbx), %rdi
- xorq %rax, %rax
- movq $(PAGE_SIZE/8), %rcx
- rep stosq
-
- /*
- * Setup current CR3 as the first and only entry in a new top level
- * page table.
- */
- movq %cr3, %rdi
- leaq 0x7 (%rdi), %rax
- movq %rax, lvl5_pgtable(%rbx)

/* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */
pushq $__KERNEL32_CS
@@ -524,13 +507,31 @@ compatible_mode:
btrl $X86_CR0_PG_BIT, %eax
movl %eax, %cr0

- /* Point CR3 to 5-level paging */
- leal lvl5_pgtable(%ebx), %eax
- movl %eax, %cr3
+ /* Check what paging mode we want to be in after the trampoline */
+ cmpl $0, %edx
+ jz 1f

- /* Enable PAE and LA57 mode */
+ /* We want 5-level paging: don't touch CR3 if it already points to 5-level page tables */
movl %cr4, %eax
- orl $(X86_CR4_PAE | X86_CR4_LA57), %eax
+ testl $X86_CR4_LA57, %eax
+ jnz 3f
+ jmp 2f
+1:
+ /* We want 4-level paging: don't touch CR3 if it already points to 4-level page tables */
+ movl %cr4, %eax
+ testl $X86_CR4_LA57, %eax
+ jz 3f
+2:
+ /* Point CR3 to the trampoline's new top level page table */
+ leal TRAMPOLINE_32BIT_PGTABLE_OFFSET(%ecx), %eax
+ movl %eax, %cr3
+3:
+ /* Enable PAE and LA57 (if required) paging modes */
+ movl $X86_CR4_PAE, %eax
+ cmpl $0, %edx
+ jz 1f
+ orl $X86_CR4_LA57, %eax
+1:
movl %eax, %cr4

/* Calculate address we are running at */
@@ -611,5 +612,3 @@ boot_stack_end:
.balign 4096
pgtable:
.fill BOOT_PGT_SIZE, 1, 0
-lvl5_pgtable:
- .fill PAGE_SIZE, 1, 0
--
2.16.1