Re: [PATCH RT] Revert "cpumask: Disable CONFIG_CPUMASK_OFFSTACK for RT"

From: Scott Wood
Date: Sun Jan 19 2020 - 03:03:39 EST


On Wed, 2019-12-18 at 18:41 +0100, Sebastian Andrzej Siewior wrote:
> The one x86 case we had was fixed in commit
> 832df3d47badc ("x86/smp: Enhance native_send_call_func_ipi()")
>
> I didn't find any other in-IRQ user. Most callers use GFP_KERNEL, and
> the GFP_ATOMIC users allocate the mask while holding a spinlock_t.
>
> Allow the use of CPUMASK_OFFSTACK because it is no longer a problem on RT.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> ---
> arch/x86/Kconfig | 2 +-
> lib/Kconfig | 1 -
> 2 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 7f359aacf8148..4b77b3273051e 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -967,7 +967,7 @@ config CALGARY_IOMMU_ENABLED_BY_DEFAULT
> config MAXSMP
> bool "Enable Maximum number of SMP Processors and NUMA Nodes"
> depends on X86_64 && SMP && DEBUG_KERNEL
> - select CPUMASK_OFFSTACK if !PREEMPT_RT
> + select CPUMASK_OFFSTACK
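
(For reference: with CONFIG_CPUMASK_OFFSTACK=y, cpumask_var_t becomes a
pointer and alloc_cpumask_var() does a real allocation at runtime --
paraphrasing include/linux/cpumask.h, not the exact source:

	#ifdef CONFIG_CPUMASK_OFFSTACK
	typedef struct cpumask *cpumask_var_t;
	/* kmalloc-backed; can fall through to the page allocator */
	bool alloc_cpumask_var(cpumask_var_t *mask, gfp_t flags);
	#else
	typedef struct cpumask cpumask_var_t[1];
	/* stack-backed; alloc_cpumask_var() is a no-op returning true */
	#endif

On RT a spinlock_t is a sleeping lock, so a GFP_ATOMIC allocation while
holding one is fine; allocating with preemption disabled is not.)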

With this applied I get splats from zalloc_cpumask_var() being called
with preemption disabled (via the get_cpu() in x86's
flush_tlb_mm_range()):


[ 26.576878] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: systemd
[ 26.576879] 3 locks held by systemd/1:
[ 26.576881] #0: ffff8897d66a1258 (&mm->mmap_sem#2){++++}, at: __do_munmap+0x430/0x470
[ 26.576891] #1: ffffffff824be3a0 (rcu_read_lock){....}, at: rt_spin_lock+0x5/0xd0
[ 26.576899] #2: ffff8897e04e42e0 ((pa_lock).lock){+.+.}, at: get_page_from_freelist+0x2eb/0x1860
[ 26.576906] Preemption disabled at:
[ 26.576906] [<ffffffff81068b5d>] flush_tlb_mm_range+0x2d/0x1b0
[ 26.576911] CPU: 11 PID: 1 Comm: systemd Tainted: G E 5.4.10-rt5.dbg+ #224
[ 26.576913] Hardware name: Intel Corporation S2600BT/S2600BT, BIOS SE5C620.86B.01.00.0763.022420181017 02/24/2018
[ 26.576914] Call Trace:
[ 26.576916] dump_stack+0x68/0x9b
[ 26.576921] ___might_sleep+0x191/0x1f0
[ 26.576929] rt_spin_lock+0x93/0xd0
[ 26.576931] ? get_page_from_freelist+0x2eb/0x1860
[ 26.576933] get_page_from_freelist+0x2eb/0x1860
[ 26.576939] ? lockdep_hardirqs_on+0xf0/0x1a0
[ 26.576944] ? _raw_spin_unlock_irqrestore+0x3e/0x90
[ 26.576945] ? find_held_lock+0x2d/0x90
[ 26.576954] __alloc_pages_nodemask+0x186/0x440
[ 26.576960] new_slab+0x357/0xc20
[ 26.576971] ___slab_alloc+0x52f/0x7a0
[ 26.576977] ? alloc_cpumask_var_node+0x1f/0x30
[ 26.576981] ? __lock_acquire+0x24b/0x1080
[ 26.576985] ? alloc_cpumask_var_node+0x1f/0x30
[ 26.576987] __slab_alloc.isra.89+0x6a/0xb4
[ 26.576992] ? alloc_cpumask_var_node+0x1f/0x30
[ 26.576993] __kmalloc_node+0xd9/0x410
[ 26.576997] ? x86_configure_nx+0x50/0x50
[ 26.576998] ? flush_tlb_func_common.isra.8+0x530/0x530
[ 26.576999] alloc_cpumask_var_node+0x1f/0x30
[ 26.577001] on_each_cpu_cond_mask+0x55/0x1e0
[ 26.577006] ? rcu_read_lock_sched_held+0x52/0x80
[ 26.577013] flush_tlb_mm_range+0x13b/0x1b0
[ 26.577018] tlb_flush_mmu+0x89/0x170
[ 26.577020] tlb_finish_mmu+0x3d/0x70
[ 26.577022] unmap_region+0xd9/0x120
[ 26.577028] ? rt_mutex_slowunlock+0x65/0x70
[ 26.577034] __do_munmap+0x26c/0x470
[ 26.577039] __vm_munmap+0x72/0xc0
[ 26.577044] __x64_sys_munmap+0x27/0x30
[ 26.577046] do_syscall_64+0x6c/0x280
[ 26.577050] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 26.577053] RIP: 0033:0x7f33a23243f7
[ 26.577054] Code: 64 89 02 48 83 c8 ff eb 9c 48 8b 15 93 da 2c 00 f7 d8 64 89 02 e9 6a ff ff ff 66 0f 1f 84 00 00 00 00 00 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 69 da 2c 00 f7 d8 64 89 01 48
[ 26.577056] RSP: 002b:00007fffc45a8228 EFLAGS: 00000206 ORIG_RAX: 000000000000000b
[ 26.577056] RAX: ffffffffffffffda RBX: 000055bce7ca7790 RCX: 00007f33a23243f7
[ 26.577057] RDX: 0000000000000000 RSI: 0000000000001000 RDI: 00007f33a3b0f000
[ 26.577058] RBP: 0000000000000000 R08: 000055bce7ca7870 R09: 00007f33a3afc940
[ 26.577058] R10: 0000000000000006 R11: 0000000000000206 R12: 0000000000000000
[ 26.577059] R13: 0000000000000000 R14: 000055bce7ca6e70 R15: 000055bce7c9cea0
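
Condensed from the trace above, the problematic path is (a sketch, with
unrelated frames dropped):

	flush_tlb_mm_range()
	  get_cpu()                        /* preempt_disable() */
	  on_each_cpu_cond_mask()
	    alloc_cpumask_var(GFP_ATOMIC)  /* real kmalloc, CPUMASK_OFFSTACK=y */
	      __kmalloc_node()
	        new_slab()                 /* slab cache needs a fresh page */
	          get_page_from_freelist()
	            rt_spin_lock()         /* pa_lock: sleeps on RT -> splat */
	  put_cpu()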

-Scott