WARNING: suspicious RCU usage - while installing a VM on a CPU listed under nohz_full

From: Nitesh Narayan Lal
Date: Tue Jul 28 2020 - 10:40:08 EST


Hi,

I have recently come across an RCU trace with the 5.8-rc7 kernel that has the
debug configs enabled while installing a VM on a CPU that is listed under
nohz_full.

Based on some of the initial debugging, my impression is that the issue is
triggered because of the fastpath that is meant to optimize the writes to x2APIC
ICR that eventually leads to a virtual IPI in fixed delivery mode, is getting
invoked from the quiescent state.

Following is the RCU trace dump that I was getting:

[Â 178.109535] =============================
[Â 178.114027] WARNING: suspicious RCU usage
[Â 178.118518] 5.8.0-rc7-upstream-+ #10 Not tainted
[Â 178.123685] -----------------------------
[Â 178.128176] arch/x86/kvm/lapic.c:269 suspicious rcu_dereference_check() usage!
[Â 178.136254]
[Â 178.136254] other info that might help us debug this:
[Â 178.136254]
[Â 178.145205]
[Â 178.145205] rcu_scheduler_active = 2, debug_locks = 1
[Â 178.152506] 1 lock held by CPU 0/KVM/2959:
[Â 178.157091]Â #0: ffffc9000717b6f8 (&kvm->arch.apic_map_lock){+.+.}-{3:3}, at:
kvm_recalculate_apic_map+0x8b/0xdd0 [kvm]
[Â 178.169207]
[Â 178.169207] stack backtrace:
[Â 178.174086] CPU: 18 PID: 2959 Comm: CPU 0/KVM Not tainted
5.8.0-rc7-upstream-+ #10
[Â 178.182539] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.6.0 10/31/2017
[Â 178.190895] Call Trace:
[Â 178.193637]Â dump_stack+0x9d/0xe0
[Â 178.197379]Â kvm_recalculate_apic_map+0x8ce/0xdd0 [kvm]
[Â 178.203259]Â ? kvm_lapic_reset+0x832/0xe50 [kvm]
[Â 178.208459]Â kvm_vcpu_reset+0x28/0x7b0 [kvm]
[Â 178.213270]Â kvm_arch_vcpu_create+0x830/0xb70 [kvm]
[Â 178.218759]Â kvm_vm_ioctl+0x11b1/0x1fe0 [kvm]
[Â 178.223635]Â ? mark_lock+0x144/0x19e0
[Â 178.227757]Â ? kvm_unregister_device_ops+0xe0/0xe0 [kvm]
[Â 178.233698]Â ? sched_clock+0x5/0x10
[Â 178.237597]Â ? sched_clock_cpu+0x18/0x1d0
[Â 178.242087]Â ? __lock_acquire+0xcf6/0x5010
[Â 178.246686]Â ? lockdep_hardirqs_on_prepare+0x550/0x550
[Â 178.252429]Â ? lockdep_hardirqs_on_prepare+0x550/0x550
[Â 178.258177]Â ? sched_clock+0x5/0x10
[Â 178.262074]Â ? sched_clock_cpu+0x18/0x1d0
[Â 178.266556]Â ? find_held_lock+0x3a/0x1c0
[Â 178.270953]Â ? ioctl_file_clone+0x120/0x120
[Â 178.275630]Â ? selinux_file_ioctl+0x98/0x570
[Â 178.280405]Â ? selinux_file_mprotect+0x5b0/0x5b0
[Â 178.285569]Â ? rcu_tasks_wait_gp+0x6d1/0xa50
[Â 178.290342]Â ? rcu_read_lock_sched_held+0xe0/0xe0
[Â 178.295608]Â ? __fget_files+0x1f0/0x300
[Â 178.299912]Â ksys_ioctl+0xc0/0x110
[Â 178.303719]Â __x64_sys_ioctl+0x6f/0xb0
[Â 178.307913]Â do_syscall_64+0x51/0xb0
[Â 178.311913]Â entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Â 178.317557] RIP: 0033:0x7f6b9700d88b
[Â 178.321551] Code: 0f 1e fa 48 8b 05 fd 95 2c 00 64 c7 00 26 00 00 00 48 c7 c0
ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0
ff ff 73 018
[Â 178.342513] RSP: 002b:00007f6b8cbe9668 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[Â 178.350967] RAX: ffffffffffffffda RBX: 000055e8162d9000 RCX: 00007f6b9700d88b
[Â 178.358935] RDX: 0000000000000000 RSI: 000000000000ae41 RDI: 000000000000000e
[Â 178.366903] RBP: 000055e8162d9000 R08: 000055e8155ec4d0 R09: 000055e8162d9000
[Â 178.374871] R10: 000055e815d94ee0 R11: 0000000000000246 R12: 000055e8162ad420
[Â 178.382838] R13: 000055e8162d9000 R14: 00007ffedf043660 R15: 00007f6b8cbe9800
[Â 182.771858]
[Â 182.773606] =============================
[Â 182.778084] WARNING: suspicious RCU usage
[Â 182.782564] 5.8.0-rc7-upstream-+ #10 Not tainted
[Â 182.787719] -----------------------------
[Â 182.792197] arch/x86/include/asm/trace/fpu.h:60 suspicious
rcu_dereference_check() usage!
[Â 182.801329]
[Â 182.801329] other info that might help us debug this:
[Â 182.801329]
[Â 182.810268]
[Â 182.810268] RCU used illegally from idle CPU!
[Â 182.810268] rcu_scheduler_active = 2, debug_locks = 1
[Â 182.822407] RCU used illegally from extended quiescent state!
[Â 182.828824] 1 lock held by CPU 0/KVM/2959:
[Â 182.833397]Â #0: ffff88903f8500d0 (&vcpu->mutex){+.+.}-{3:3}, at:
kvm_vcpu_ioctl+0x172/0xb00 [kvm]
[Â 182.838308]
[Â 182.838308] stack backtrace:
[Â 182.838313] CPU: 18 PID: 2959 Comm: CPU 0/KVM Tainted: GÂÂÂÂÂÂÂ WÂÂÂÂÂÂÂÂ
5.8.0-rc7-upstream-+ #10
[Â 182.838316] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.6.0 10/31/2017
[Â 182.838318] Call Trace:
[Â 182.874318]Â dump_stack+0x9d/0xe0
[Â 182.878024]Â switch_fpu_return+0x37c/0x410
[Â 182.882602]Â ? fpu__clear+0x1a0/0x1a0
[Â 182.886700]Â ? rcu_dynticks_eqs_enter+0x15/0x30
[Â 182.891807]Â vcpu_enter_guest+0x1854/0x3df0 [kvm]
[Â 182.897121]Â ? kvm_vcpu_reload_apic_access_page+0x60/0x60 [kvm]
[Â 182.903738]Â ? lock_acquire+0x1ac/0xac0
[Â 182.908062]Â ? kvm_arch_vcpu_ioctl_run+0x1dc/0x13c0 [kvm]
[Â 182.914107]Â ? rcu_read_unlock+0x50/0x50
[Â 182.918489]Â ? rcu_read_lock_sched_held+0xaf/0xe0
[Â 182.923788]Â ? kvm_load_guest_fpu+0x94/0x350 [kvm]
[Â 182.929177]Â ? kvm_load_guest_fpu+0x94/0x350 [kvm]
[Â 182.934528]Â ? __local_bh_enable_ip+0x123/0x1a0
[Â 182.939635]Â kvm_arch_vcpu_ioctl_run+0x310/0x13c0 [kvm]
[Â 182.945529]Â kvm_vcpu_ioctl+0x3ee/0xb00 [kvm]
[Â 182.950406]Â ? sched_clock+0x5/0x10
[Â 182.954336]Â ? kvm_set_memory_region+0x40/0x40 [kvm]
[Â 182.959897]Â ? ioctl_file_clone+0x120/0x120
[Â 182.964572]Â ? selinux_file_ioctl+0x98/0x570
[Â 182.969347]Â ? selinux_file_mprotect+0x5b0/0x5b0
[Â 182.974507]Â ? rcu_tasks_wait_gp+0x710/0xa50
[Â 182.979280]Â ? rcu_read_lock_sched_held+0xe0/0xe0
[Â 182.984547]Â ? __fget_files+0x1f0/0x300
[Â 182.988852]Â ksys_ioctl+0xc0/0x110
[Â 182.992660]Â __x64_sys_ioctl+0x6f/0xb0
[Â 182.996853]Â do_syscall_64+0x51/0xb0
[Â 183.000849]Â entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Â 183.006491] RIP: 0033:0x7f6b9700d88b
[Â 183.010486] Code: 0f 1e fa 48 8b 05 fd 95 2c 00 64 c7 00 26 00 00 00 48 c7 c0
ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0
ff ff 73 018
[Â 183.031446] RSP: 002b:00007f6b8cbe9618 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[Â 183.039901] RAX: ffffffffffffffda RBX: 000055e8162d9000 RCX: 00007f6b9700d88b
[Â 183.047868] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000018
[Â 183.055826] RBP: 000055e8162d909b R08: 000055e8155ec1d0 R09: 000055e815db1620
[Â 183.063784] R10: 0000000000000000 R11: 0000000000000246 R12: 000055e8151f6290
[Â 183.071752] R13: 000055e8155c81c0 R14: 00007ffedf043660 R15: 00007f6b9c163000

Please let me know if any other information is required.

--
Nitesh

Attachment: signature.asc
Description: OpenPGP digital signature