Re: [syzbot] [gfs2?] kernel panic: hung_task: blocked tasks (2)

From: Bob Peterson
Date: Fri Jul 28 2023 - 07:49:37 EST


On 7/28/23 3:20 AM, David Howells wrote:
syzbot <syzbot+607aa822c60b2e75b269@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

Fixes: 9c8ad7a2ff0b ("uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]")

This would seem unlikely to be the culprit. It just changes the numbering on
the fsconfig-related syscalls.

Running the test program on v6.5-rc3, however, I end up with the test process
stuck in the D state:

INFO: task repro-17687f1aa:5551 blocked for more than 120 seconds.
Not tainted 6.5.0-rc3-build3+ #1448
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:repro-17687f1aa state:D stack:0 pid:5551 ppid:5516 flags:0x00004002
Call Trace:
<TASK>
__schedule+0x4a7/0x4f1
schedule+0x66/0xa1
schedule_timeout+0x9d/0xd7
? __next_timer_interrupt+0xf6/0xf6
gfs2_gl_hash_clear+0xa0/0xdc
? sugov_irq_work+0x15/0x15
gfs2_put_super+0x19f/0x1d3
generic_shutdown_super+0x78/0x187
kill_block_super+0x1c/0x32
deactivate_locked_super+0x2f/0x61
cleanup_mnt+0xab/0xcc
task_work_run+0x6b/0x80
exit_to_user_mode_prepare+0x76/0xfd
syscall_exit_to_user_mode+0x14/0x31
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f89aac31dab
RSP: 002b:00007fff43d9b878 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007fff43d9cad8 RCX: 00007f89aac31dab
RDX: 0000000000000000 RSI: 000000000000000a RDI: 00007fff43d9b920
RBP: 00007fff43d9c960 R08: 0000000000000000 R09: 0000000000000073
R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
R13: 00007fff43d9cae8 R14: 0000000000417e18 R15: 00007f89aad51000
</TASK>

David

Hi David,

This trace indicates that gfs2 is having trouble resolving and freeing all of its glocks during unmount, which usually means a reference counting problem or an AIL (active items list) problem.

If gfs2_gl_hash_clear gets stuck for a long period of time, it is supposed to dump the list of glocks that still have not been resolved. I think it takes 10 minutes or so. Can you post the console messages that follow the hung task report? That will help us figure out what's happening. Thanks.
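
To illustrate the pattern described above (drop references, wait with a limit, then dump whatever is left), here is a minimal user-space sketch. It is not the gfs2 code: struct fake_glock, count_unresolved() and clear_and_dump() are made-up names, and a bounded poll loop stands in for the kernel's timed wait.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a glock; NOT the kernel's struct gfs2_glock. */
struct fake_glock {
    int id;
    int refs;   /* references still held on this object */
};

/* Count the objects that still have at least one reference. */
static size_t count_unresolved(const struct fake_glock *tbl, size_t n)
{
    size_t left = 0;

    for (size_t i = 0; i < n; i++)
        if (tbl[i].refs > 0)
            left++;
    return left;
}

/*
 * Drop the table's own reference on every object, poll up to max_polls
 * times for the remaining references to go away, then dump whatever is
 * still referenced. Returns the number of unresolved objects.
 */
static size_t clear_and_dump(struct fake_glock *tbl, size_t n, int max_polls)
{
    assert(tbl != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        if (tbl[i].refs > 0)
            tbl[i].refs--;

    /*
     * Assumption for this sketch: nothing else drops references while
     * we wait, so a leaked reference makes the loop run to its limit.
     */
    int polls = 0;
    while (polls < max_polls && count_unresolved(tbl, n) > 0)
        polls++;

    for (size_t i = 0; i < n; i++)
        if (tbl[i].refs > 0)
            fprintf(stderr, "unresolved: id=%d refs=%d\n",
                    tbl[i].id, tbl[i].refs);

    return count_unresolved(tbl, n);
}
```

An object that entered with one extra (leaked) reference is the one that shows up in the dump, which is why the messages printed after the wait are the useful part.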

Regards,

Bob Peterson