Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up

From: Jan Kiszka
Date: Mon May 08 2017 - 02:33:33 EST


On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
> The x86 smpboot trampoline expects initial_page_table to have the
> GDT mapped. If the GDT ends up in a virtually mapped per-cpu page,
> then it won't be in the page tables at all until per-cpu areas are
> set up. The result will be a triple fault the first time that the
> CPU attempts to access the GDT after LGDT loads the per-cpu GDT.
>
> This appears to be an old bug, but somehow the GDT fixmap rework
> is triggering it. This seems to have something to do with the
> memory layout.
>
> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
> Cc: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> Cc: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: Brian Gerst <brgerst@xxxxxxxxx>
> Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
> Cc: H. Peter Anvin <hpa@xxxxxxxxx>
> Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> Cc: Juergen Gross <jgross@xxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Matt Fleming <matt@xxxxxxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Thomas Garnier <thgarnie@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: linux-efi@xxxxxxxxxxxxxxx
> Link: http://lkml.kernel.org/r/a553264a5972c6a86f9b5caac237470a0c74a720.1490218061.git.luto@xxxxxxxxxx
> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
> ---
> arch/x86/kernel/setup.c | 15 ---------------
> arch/x86/kernel/setup_percpu.c | 21 +++++++++++++++++++++
> 2 files changed, 21 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 4bf0c89..56b1177 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -1226,21 +1226,6 @@ void __init setup_arch(char **cmdline_p)
>
> kasan_init();
>
> -#ifdef CONFIG_X86_32
> - /* sync back kernel address range */
> - clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
> - swapper_pg_dir + KERNEL_PGD_BOUNDARY,
> - KERNEL_PGD_PTRS);
> -
> - /*
> - * sync back low identity map too. It is used for example
> - * in the 32-bit EFI stub.
> - */
> - clone_pgd_range(initial_page_table,
> - swapper_pg_dir + KERNEL_PGD_BOUNDARY,
> - min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
> -#endif
> -
> tboot_probe();
>
> map_vsyscall();
> diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
> index 11338b0..bb1e8cc 100644
> --- a/arch/x86/kernel/setup_percpu.c
> +++ b/arch/x86/kernel/setup_percpu.c
> @@ -288,4 +288,25 @@ void __init setup_per_cpu_areas(void)
>
> /* Setup cpu initialized, callin, callout masks */
> setup_cpu_local_masks();
> +
> +#ifdef CONFIG_X86_32
> + /*
> + * Sync back kernel address range. We want to make sure that
> + * all kernel mappings, including percpu mappings, are available
> + * in the smpboot asm. We can't reliably pick up percpu
> + * mappings using vmalloc_fault(), because exception dispatch
> + * needs percpu data.
> + */
> + clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
> + swapper_pg_dir + KERNEL_PGD_BOUNDARY,
> + KERNEL_PGD_PTRS);
> +
> + /*
> + * sync back low identity map too. It is used for example
> + * in the 32-bit EFI stub.
> + */
> + clone_pgd_range(initial_page_table,
> + swapper_pg_dir + KERNEL_PGD_BOUNDARY,
> + min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
> +#endif
> }
>

This breaks the boot on our Intel Quark platform (IOT2000, similar to
Galileo Gen2). Reverting it on top of master makes it work again. Any
idea what goes wrong? Let me know how I can help debug this.
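
For reference, here is a rough user-space model of what the two moved
clone_pgd_range() calls do. It is only a sketch under stated assumptions:
non-PAE 32-bit paging with the default 3G/1G split (1024 PGD entries,
kernel half starting at index 768), clone_pgd_range() modeled as a plain
memcpy of PGD entries, and a hypothetical wrapper sync_initial_page_table()
that does not exist in the kernel.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: non-PAE 32-bit paging, so one PGD entry is 32 bits wide. */
typedef uint32_t pgd_t;

enum {
    PTRS_PER_PGD        = 1024, /* assumed: 4 GiB / 4 MiB per entry */
    KERNEL_PGD_BOUNDARY = 768,  /* assumed: PAGE_OFFSET 0xC0000000 >> 22 */
    KERNEL_PGD_PTRS     = PTRS_PER_PGD - KERNEL_PGD_BOUNDARY,
};

/* Stand-ins for the two kernel page directories named in the patch. */
static pgd_t initial_page_table[PTRS_PER_PGD];
static pgd_t swapper_pg_dir[PTRS_PER_PGD];

/* Modeled as a straight copy of 'count' PGD entries. */
static void clone_pgd_range(pgd_t *dst, const pgd_t *src, int count)
{
    memcpy(dst, src, (size_t)count * sizeof(pgd_t));
}

/* Hypothetical wrapper around the two calls moved by the patch. */
static void sync_initial_page_table(void)
{
    /* Sync back the kernel address range. */
    clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
                    swapper_pg_dir + KERNEL_PGD_BOUNDARY,
                    KERNEL_PGD_PTRS);

    /* Sync back the low identity map from the same kernel entries. */
    clone_pgd_range(initial_page_table,
                    swapper_pg_dir + KERNEL_PGD_BOUNDARY,
                    KERNEL_PGD_PTRS < KERNEL_PGD_BOUNDARY ?
                        KERNEL_PGD_PTRS : KERNEL_PGD_BOUNDARY);
}
```

In this model the patch changes nothing about what gets copied, only when
it happens: whatever swapper_pg_dir contains at the time of the call is
what ends up in initial_page_table, so the difference on Quark would have
to come from the state of swapper_pg_dir (or of initial_page_table users)
between setup_arch() and setup_per_cpu_areas().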

Jan

--
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux