Re: WARNING: suspicious RCU usage - while installing a VM on a CPU listed under nohz_full

From: Nitesh Narayan Lal
Date: Thu Jul 30 2020 - 18:45:19 EST



On 7/29/20 8:34 AM, Nitesh Narayan Lal wrote:
> On 7/28/20 10:38 PM, Wanpeng Li wrote:
>> Hi Nitesh,
>> On Wed, 29 Jul 2020 at 09:00, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>>> On Tue, 28 Jul 2020 at 22:40, Nitesh Narayan Lal <nitesh@xxxxxxxxxx> wrote:
>>>> Hi,
>>>>
>>>> I have recently come across an RCU trace with the 5.8-rc7 kernel that has the
>>>> debug configs enabled while installing a VM on a CPU that is listed under
>>>> nohz_full.
>>>>
>>>> Based on some initial debugging, my impression is that the issue is
>>>> triggered because the fastpath that optimizes writes to the x2APIC ICR,
>>>> which eventually lead to a virtual IPI in fixed delivery mode, is being
>>>> invoked from the quiescent state.
>> Could you try the latest linux-next tree? I guess some patches may still be
>> pending there; I can't reproduce this against linux-next.
> Sure, I will try this today.

Hi Wanpeng,

I can no longer reproduce the issue with the linux-next tree.
However, I am still seeing a warning with the following stack trace:

[  139.220080] RIP: 0010:kvm_arch_vcpu_ioctl_run+0xb57/0x1320 [kvm]
[  139.226837] Code: e8 03 0f b6 04 18 84 c0 74 06 0f 8e 4a 03 00 00 41 c6 85 48 31 00 00 00 e9 24 f8 ff ff 4c 89 ef e8 7e ac 02 00 e9 3d f8 ff ff <0f> 0b e9 f2 f8 ff ff 48f
[  139.247828] RSP: 0018:ffff8889bc397cb8 EFLAGS: 00010202
[  139.253700] RAX: 0000000000000001 RBX: dffffc0000000000 RCX: ffffffffc1fc3bef
[  139.261695] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff888f0fa1a8a0
[  139.269692] RBP: ffff8889bc397d18 R08: ffffed113786a7d0 R09: ffffed113786a7d0
[  139.277686] R10: ffff8889bc353e7f R11: ffffed113786a7cf R12: ffff8889bc35423c
[  139.285682] R13: ffff8889bc353e40 R14: ffff8889bc353e6c R15: ffff88897f536000
[  139.293678] FS:  00007f3d8a71c700(0000) GS:ffff888a3c400000(0000) knlGS:0000000000000000
[  139.302742] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  139.309186] CR2: 0000000000000000 CR3: 00000009bc34c004 CR4: 00000000003726e0
[  139.317180] Call Trace:
[  139.320002]  kvm_vcpu_ioctl+0x3ee/0xb10 [kvm]
[  139.324907]  ? sched_clock+0x5/0x10
[  139.328875]  ? kvm_io_bus_get_dev+0x1c0/0x1c0 [kvm]
[  139.334375]  ? ioctl_file_clone+0x120/0x120
[  139.339079]  ? selinux_file_ioctl+0x98/0x570
[  139.343895]  ? selinux_file_mprotect+0x5b0/0x5b0
[  139.349088]  ? irq_matrix_assign+0x360/0x430
[  139.353904]  ? rcu_read_lock_sched_held+0xe0/0xe0
[  139.359201]  ? __fget_files+0x1f0/0x300
[  139.363532]  __x64_sys_ioctl+0x128/0x18e
[  139.367948]  do_syscall_64+0x33/0x40
[  139.371974]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  139.377643] RIP: 0033:0x7f3d98d0a88b
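
As a side note on reading the Code: line in the trace above: the byte in angle brackets is the one at RIP, and here it is followed by 0b, which together form the x86 UD2 opcode (0f 0b) that WARN()/BUG() sites compile to. The snippet below is only a rough sketch of that check; the helper names are made up for illustration, and scripts/decodecode in the kernel tree is the proper tool for a full disassembly.

```python
import re

# Code: line copied from the trace above; the dump is truncated at "48f".
CODE_LINE = (
    "e8 03 0f b6 04 18 84 c0 74 06 0f 8e 4a 03 00 00 41 c6 85 48 "
    "31 00 00 00 e9 24 f8 ff ff 4c 89 ef e8 7e ac 02 00 e9 3d f8 ff ff "
    "<0f> 0b e9 f2 f8 ff ff 48f"
)


def faulting_bytes(code_line, count=2):
    """Return `count` bytes starting at the <xx>-marked byte (the one at RIP).

    Hypothetical helper, not part of any kernel tooling.
    """
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        m = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if m:
            following = tokens[i + 1 : i + count]
            return [m.group(1).lower()] + [t.lower() for t in following]
    return []  # no marked byte found


def is_ud2(code_line):
    # 0f 0b is the x86 UD2 opcode.
    return faulting_bytes(code_line) == ["0f", "0b"]


print(faulting_bytes(CODE_LINE), is_ud2(CODE_LINE))
```

This only says the trap is a UD2 at kvm_arch_vcpu_ioctl_run+0xb57; which WARN condition it belongs to still has to be looked up in the source for that build.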

Are you also triggering anything like this in your environment?


>
--
Nitesh
