Re: suspicious RCU due to "Prefer using an idle CPU as a migration target instead of comparing tasks"

From: Qian Cai
Date: Thu Feb 27 2020 - 09:47:51 EST


On Thu, 2020-02-27 at 09:09 -0500, Qian Cai wrote:
> The linux-next commit ff7db0bf24db ("sched/numa: Prefer using an idle CPU as a
> migration target instead of comparing tasks") introduced a boot warning,

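Does this fix it? The splat comes from the rcu_dereference_check() in
test_idle_cores(), which update_numa_stats() reaches via numa_idle_core()
without holding rcu_read_lock():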

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a61d83ea2930..ca780cd1eae2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1607,7 +1607,9 @@ static void update_numa_stats(struct task_numa_env *env,
                        if (ns->idle_cpu == -1)
                                ns->idle_cpu = cpu;
 
+                       rcu_read_lock();
                        idle_core = numa_idle_core(idle_core, cpu);
+                       rcu_read_unlock();
                }
        }
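
For anyone following along, here is a rough userspace sketch of why taking
the read lock around the call should silence the splat. It is not kernel
code: every mock_* name and the nesting counter are made-up stand-ins, and
the check only imitates the idea behind lockdep's rcu_dereference_check()
(complain when a dereference happens outside a read-side critical section).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration only; not the kernel's RCU. */
int mock_rcu_nesting;      /* depth of read-side critical sections */
int mock_suspicious_count; /* number of unprotected dereferences seen */

void mock_rcu_read_lock(void)
{
	mock_rcu_nesting++;
}

void mock_rcu_read_unlock(void)
{
	assert(mock_rcu_nesting > 0); /* unlock must pair with a lock */
	mock_rcu_nesting--;
}

/* Made-up miniature of the shared LLC state that the scheduler reads. */
struct mock_sd_shared {
	int has_idle_cores;
};

struct mock_sd_shared mock_llc_shared = { .has_idle_cores = 1 };

/* Imitates the debug check: warn if no read-side section is active. */
struct mock_sd_shared *mock_rcu_dereference(struct mock_sd_shared *p)
{
	if (mock_rcu_nesting == 0) {
		mock_suspicious_count++;
		fprintf(stderr, "WARNING: suspicious RCU usage (mock)\n");
	}
	return p;
}

/* Same shape as the reported call site: dereference, then read a flag. */
bool mock_test_idle_cores(bool def)
{
	struct mock_sd_shared *sds = mock_rcu_dereference(&mock_llc_shared);

	return sds ? sds->has_idle_cores != 0 : def;
}

/* Caller without the lock, like the code before the patch. */
bool lookup_unprotected(void)
{
	return mock_test_idle_cores(false);
}

/* Caller with the lock held across the call, like the patched code. */
bool lookup_protected(void)
{
	bool ret;

	mock_rcu_read_lock();
	ret = mock_test_idle_cores(false);
	mock_rcu_read_unlock();
	return ret;
}
```

Calling lookup_unprotected() bumps mock_suspicious_count and prints the mock
warning; lookup_protected() returns the same value without tripping the
check. The real fix may equally well belong inside numa_idle_core() or
further up the call chain; the sketch only shows the pairing.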

>
> [   86.520534][    T1] WARNING: suspicious RCU usage
> [   86.520540][    T1] 5.6.0-rc3-next-20200227 #7 Not tainted
> [   86.520545][    T1] -----------------------------
> [   86.520551][    T1] kernel/sched/fair.c:5914 suspicious rcu_dereference_check() usage!
> [   86.520555][    T1]
> [   86.520555][    T1] other info that might help us debug this:
> [   86.520555][    T1]
> [   86.520561][    T1]
> [   86.520561][    T1] rcu_scheduler_active = 2, debug_locks = 1
> [   86.520567][    T1] 1 lock held by systemd/1:
> [   86.520571][    T1]  #0: ffff8887f4b14848 (&mm->mmap_sem#2){++++}, at: do_page_fault+0x1d2/0x998
> [   86.520594][    T1]
> [   86.520594][    T1] stack backtrace:
> [   86.520602][    T1] CPU: 1 PID: 1 Comm: systemd Not tainted 5.6.0-rc3-next-20200227 #7
> [   86.520607][    T1] Hardware name: HP ProLiant XL450 Gen9 Server/ProLiant XL450 Gen9 Server, BIOS U21 05/05/2016
> [   86.520612][    T1] Call Trace:
> [   86.520623][    T1]  dump_stack+0xa0/0xea
> [   86.520634][    T1]  lockdep_rcu_suspicious+0x102/0x10b
> lockdep_rcu_suspicious at kernel/locking/lockdep.c:5648
> [   86.520641][    T1]  update_numa_stats+0x577/0x710
> test_idle_cores at kernel/sched/fair.c:5914
> (inlined by) numa_idle_core at kernel/sched/fair.c:1565
> (inlined by) update_numa_stats at kernel/sched/fair.c:1610
> [   86.520649][    T1]  ? rcu_read_lock_held+0xac/0xc0
> [   86.520657][    T1]  task_numa_migrate+0x4aa/0xdb0
> [   86.520664][    T1]  ? task_numa_find_cpu+0x1010/0x1010
> [   86.520677][    T1]  ? migrate_pages+0x29c/0x17c0
> [   86.520683][    T1]  task_numa_fault+0x607/0xd90
> [   86.520691][    T1]  ? task_numa_free+0x230/0x230
> [   86.520698][    T1]  ? __kasan_check_read+0x11/0x20
> [   86.520704][    T1]  ? do_raw_spin_unlock+0xa8/0x140
> [   86.520712][    T1]  do_numa_page+0x33f/0x450
> [   86.520720][    T1]  __handle_mm_fault+0xb81/0xb90
> [   86.520727][    T1]  ? copy_page_range+0x420/0x420
> [   86.520736][    T1]  handle_mm_fault+0xdc/0x2e0
> [   86.520742][    T1]  do_page_fault+0x2c7/0x998
> [   86.520752][    T1]  page_fault+0x34/0x40
> [   86.520758][    T1] RIP: 0033:0x7f95faf63c53
> [   86.520766][    T1] Code: 00 41 00 3d 00 00 41 00 74 3d 48 8d 05 d6 5a 2d 00 8b 00 85 c0 75 61 b8 01 01 00 00 0f 05 48 3d 00 f0 ff ff 0f 87 a5 00 00 00 <48> 8b 4c 24 38 64 48 33 0c 25 28 00 00 00 0f 85 ba 00 00 00 48 83
> [   86.520771][    T1] RSP: 002b:00007ffdda737790 EFLAGS: 00010207
> [   86.520778][    T1] RAX: 0000000000000024 RBX: 0000562a594b9fd0 RCX: 00007f95faf63c47
> [   86.520783][    T1] RDX: 00000000002a0000 RSI: 0000562a594b9fd1 RDI: 0000000000000023
> [   86.520788][    T1] RBP: 00007ffdda7379c0 R08: 00007f95fc734e30 R09: 00007ffdda737d60
> [   86.520793][    T1] R10: 0000000000000000 R11: 0000000000000246 R12: 0000562a59459fb4
> [   86.520798][    T1] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000