INFO: rcu_preempt detected expedited stalls on CPUs/tasks (6.1.0-rc3): in cat /sys/kernel/debug/kmemleak

From: Mirsad Goran Todorovac
Date: Fri Nov 04 2022 - 18:02:11 EST

Next message: Antonio Gomes: "[PATCH v3] drm/nouveau: Add support to control backlight using bl_power for nva3."
Previous message: Sean Christopherson: "[PATCH 2/2] x86/mm: Populate KASAN shadow for per-CPU DS buffers in CPU entry area"
Next in thread: srinivas pandruvada: "Re: INFO: rcu_preempt detected expedited stalls on CPUs/tasks (6.1.0-rc3): in cat /sys/kernel/debug/kmemleak"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Dear all,

When investigating thermald kmemleak, it occurred that the "cat /sys/kernel/debug/kmemleak"
and "tail -20 /sys/kernel/debug/kmemleak" commands take unusual amount of time.

Dmesg output showed expedited stalls that the commands caused NMIs and NMI backtraces:

[ 8123.263464] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 26 jiffies s: 3829 root: 0x1/.
[ 8123.263500] rcu: blocking rcu_node structures (internal RCU debug):
[ 8123.263508] Sending NMI from CPU 7 to CPUs 0:
[ 8123.263528] NMI backtrace for cpu 0
[ 8123.263539] CPU: 0 PID: 27898 Comm: cat Not tainted 6.1.0-rc3 #1
[ 8123.263552] Hardware name: LENOVO 82H8/LNVNB161216, BIOS GGCN34WW 03/08/2022
[ 8123.263557] RIP: 0010:kmemleak_seq_start+0x41/0x80
[ 8123.263579] Code: 55 04 a6 00 4c 63 e0 85 c0 78 40 e8 b9 80 db ff 48 8b 05 92 fb 88 01 4c 8d 60 f8 48 3d 30
                                    62 63 92 75 17 eb 32 49 8b 44 24 08 <48> 83 eb 01 4c 8d 60 f8 48 3d 30 62 63 92 74 1d
                                    48 85 db 7f e6 4c
[ 8123.263588] RSP: 0018:ffff9968e400fc30 EFLAGS: 00000206
[ 8123.263598] RAX: ffff8963b7005b58 RBX: 0000000000011cb8 RCX: 0000000000000001
[ 8123.263604] RDX: ffff8963b09c4000 RSI: ffff8963856de028 RDI: ffffffff926361c0
[ 8123.263608] RBP: ffff9968e400fc40 R08: 0000000000020000 R09: ffff896380592a80
[ 8123.263613] R10: 0000000000020000 R11: 0000000000000000 R12: ffff8964114c2390
[ 8123.263617] R13: ffff89639fa25b00 R14: ffff8963856de000 R15: ffff9968e400fe30
[ 8123.263622] FS: 00007ff217c15740(0000) GS:ffff896528800000(0000) knlGS:0000000000000000
[ 8123.263630] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8123.263636] CR2: 00007f19bd6a8000 CR3: 000000028e392001 CR4: 0000000000770ef0
[ 8123.263642] PKRU: 55555554
[ 8123.263646] Call Trace:
[ 8123.263649] <TASK>
[ 8123.263656] seq_read_iter+0x169/0x420
[ 8123.263671] seq_read+0xad/0xe0
[ 8123.263685] full_proxy_read+0x59/0x90
[ 8123.263701] vfs_read+0xb2/0x2e0
[ 8123.263718] ksys_read+0x61/0xe0
[ 8123.263730] __x64_sys_read+0x1a/0x20
[ 8123.263741] do_syscall_64+0x58/0x80
[ 8123.263754] ? do_syscall_64+0x67/0x80
[ 8123.263767] ? exit_to_user_mode_prepare+0x15d/0x190
[ 8123.263781] ? syscall_exit_to_user_mode+0x1b/0x30
[ 8123.263791] ? do_syscall_64+0x67/0x80
[ 8123.263804] ? syscall_exit_to_user_mode+0x1b/0x30
[ 8123.263813] ? do_syscall_64+0x67/0x80
[ 8123.263823] ? do_syscall_64+0x67/0x80
[ 8123.263833] ? do_syscall_64+0x67/0x80
[ 8123.263844] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 8123.263857] RIP: 0033:0x7ff217914992
[ 8123.263867] Code: c0 e9 b2 fe ff ff 50 48 8d 3d fa b2 0c 00 e8 c5 1d 02 00 0f 1f 44 00 00 f3 0f
                                    1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56
                                    c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 8123.263875] RSP: 002b:00007ffdfcbc3e28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 8123.263894] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007ff217914992
[ 8123.263899] RDX: 0000000000020000 RSI: 00007ff217b9d000 RDI: 0000000000000003
[ 8123.263904] RBP: 00007ff217b9d000 R08: 00007ff217b9c010 R09: 00007ff217b9c010
[ 8123.263909] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000
[ 8123.263914] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[ 8123.263927] </TASK>

To reproduce:

Enable CONFIG_DEBUG_KMEMLEAK=y in linux_stable 6.1.0-rc3 build.

Then, a stress test on a service known from the previous report is required:

for a in {1..1000}; doRIP: 0010:kmemleak_seq_start+0x41/0x80
         echo $a
         systemctl stop thermald
         sleep 0.5
         systemctl start thermald
         sleep 0.5
done

After that, /sys/kernel/debug/kmemleak indicated 1413 unreferenced objects.

However, dmesg had shown the stalls on CPUs while executing "cat" or "tail -40" of /sys/kernel/debug/kmemleak.

I've read the https://www.kernel.org/doc/Documentation/RCU/stallwarn.txt, but I was unable to understand
what is going on except to locate the stall to "RIP: 0010:kmemleak_seq_start+0x41/0x80" and probable
maintainer/developer.

Please find attached the kernel build config, the output of "cat /sys/kernel/debug/kmemleak" and
dmesg ouput about the expedited stalls. The number of jiffies is not so large, but IMHO they possibly indicate
greater problems or deadlocks in RCU code in kmemleak.

NOTE:
Please add:

Reported-By: Mirsad Goran Todorovac <mirsad.todorovac@xxxxxxxxxxxx>

if (or when) the bug is fixed.

Thank you.

-mt

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

Attachment: config-6.1.0-rc3.gz
Description: application/gzip

Attachment: expedited_stalls_cat_tail_kmemleak.dmesg.gz
Description: application/gzip

Attachment: kmemleak-thermald.txt.gz
Description: application/gzip

Next message: Antonio Gomes: "[PATCH v3] drm/nouveau: Add support to control backlight using bl_power for nva3."
Previous message: Sean Christopherson: "[PATCH 2/2] x86/mm: Populate KASAN shadow for per-CPU DS buffers in CPU entry area"
Next in thread: srinivas pandruvada: "Re: INFO: rcu_preempt detected expedited stalls on CPUs/tasks (6.1.0-rc3): in cat /sys/kernel/debug/kmemleak"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]