Re: [tip:x86/urgent] x86/mm/pat: Disable preemption around __flush_tlb_all()

From: Dan Williams
Date: Mon Nov 05 2018 - 16:56:34 EST


On Mon, Oct 29, 2018 at 11:12 AM tip-bot for Sebastian Andrzej Siewior
<tipbot@xxxxxxxxx> wrote:
>
> Commit-ID: f77084d96355f5fba8e2c1fb3a51a393b1570de7
> Gitweb: https://git.kernel.org/tip/f77084d96355f5fba8e2c1fb3a51a393b1570de7
> Author: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> AuthorDate: Wed, 17 Oct 2018 12:34:32 +0200
> Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> CommitDate: Mon, 29 Oct 2018 19:04:31 +0100
>
> x86/mm/pat: Disable preemption around __flush_tlb_all()
>
> The WARN_ON_ONCE(__read_cr3() != build_cr3()) in switch_mm_irqs_off()
> triggers every once in a while during a snapshotted system upgrade.
>
> The warning triggers since commit decab0888e6e ("x86/mm: Remove
> preempt_disable/enable() from __native_flush_tlb()"). The callchain is:
>
> get_page_from_freelist() -> post_alloc_hook() -> __kernel_map_pages()
>
> with CONFIG_DEBUG_PAGEALLOC enabled.
>
> Disable preemption during CR3 reset / __flush_tlb_all() and add a comment
> why preemption has to be disabled so it won't be removed accidentaly.
>
> Add another preemptible() check in __flush_tlb_all() to catch callers with
> enabled preemption when PGE is enabled, because PGE enabled does not
> trigger the warning in __native_flush_tlb(). Suggested by Andy Lutomirski.
>
> Fixes: decab0888e6e ("x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()")
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Link: https://lkml.kernel.org/r/20181017103432.zgv46nlu3hc7k4rq@xxxxxxxxxxxxx
> ---
> arch/x86/include/asm/tlbflush.h | 6 ++++++
> arch/x86/mm/pageattr.c | 6 +++++-
> 2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index 323a313947e0..d760611cfc35 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -453,6 +453,12 @@ static inline void __native_flush_tlb_one_user(unsigned long addr)
> */
> static inline void __flush_tlb_all(void)
> {
> + /*
> + * This is to catch users with enabled preemption and the PGE feature
> + * and don't trigger the warning in __native_flush_tlb().
> + */
> + VM_WARN_ON_ONCE(preemptible());

This warning triggers 100% of the time for the pmem use case and it
seems it would also trigger for any memory hotplug use case that uses
arch_add_memory().

WARNING: CPU: 35 PID: 911 at ./arch/x86/include/asm/tlbflush.h:460
__flush_tlb_all+0x1b/0x3a
CPU: 35 PID: 911 Comm: systemd-udevd Tainted: G OE
4.20.0-rc1+ #2583
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:__flush_tlb_all+0x1b/0x3a
[..]
Call Trace:
phys_pud_init+0x29c/0x2bb
kernel_physical_mapping_init+0xfc/0x219
init_memory_mapping+0x1a5/0x3b0
arch_add_memory+0x2c/0x50
devm_memremap_pages+0x3aa/0x610
pmem_attach_disk+0x585/0x700 [nd_pmem]

...could we just move the preempt_disable() inside __flush_tlb_all()?