Re: INFO: task hung in grab_super

From: Dominique Martinet
Date: Fri Nov 02 2018 - 18:45:58 EST

In reply to: Dmitry Vyukov: "Re: INFO: task hung in grab_super"

Dmitry Vyukov wrote on Fri, Nov 02, 2018:
> >> I guess that's the problem, right? SIGKILL-ed task must not ignore
> >> SIGKILL and hang in infinite loop. This would explain a bunch of hangs
> >> in 9p.
> >
> > Did you check /proc/18253/task/*/stack after manually sending SIGKILL?
>
> Yes:
>
> root@syzkaller:~# ps afxu | grep syz
> root 18253 0.0 0.0 0 0 ttyS0 Zl 10:16 0:00 \_ [syz-executor] <defunct>
> root@syzkaller:~# cat /proc/18253/task/*/stack
> [<0>] p9_client_rpc+0x3a2/0x1400
> [<0>] p9_client_flush+0x134/0x2a0
> [<0>] p9_client_rpc+0x122c/0x1400
> [<0>] p9_client_create+0xc56/0x16af
> [<0>] v9fs_session_init+0x21a/0x1a80
> [<0>] v9fs_mount+0x7c/0x900
> [<0>] mount_fs+0xae/0x328
> [<0>] vfs_kern_mount.part.34+0xdc/0x4e0
> [<0>] do_mount+0x581/0x30e0
> [<0>] ksys_mount+0x12d/0x140
> [<0>] __x64_sys_mount+0xbe/0x150
> [<0>] do_syscall_64+0x1b9/0x820
> [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [<0>] 0xffffffffffffffff

Yes, that's a known problem with the current code: since everything must
be cleaned up on the spot, the first kill sends a flush and then waits
for the flush reply to come; the second kill is completely ignored.
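
To illustrate, here is a toy userspace model of that behaviour. It is
only a sketch and not the net/9p code: the event names and the
wait_current() helper are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds seen by a task blocked in the RPC wait. */
enum ev { EV_KILL, EV_FLUSH_REPLY };

/*
 * Toy model of the current behaviour: returns how many events are
 * consumed before the caller gets to return, or -1 if it never does.
 */
static int wait_current(const enum ev *evs, size_t n)
{
    int flush_sent = 0;

    for (size_t i = 0; i < n; i++) {
        if (evs[i] == EV_KILL && !flush_sent) {
            flush_sent = 1;     /* first kill: send flush, keep waiting */
            continue;
        }
        if (evs[i] == EV_KILL)
            continue;           /* any later kill is ignored */
        if (evs[i] == EV_FLUSH_REPLY && flush_sent)
            return (int)i + 1;  /* only the flush reply lets us out */
    }
    return -1;                  /* no flush reply: the task stays hung */
}
```

In this model, if the server never answers the flush, no number of kills
gets the task out of the wait, which matches the stack trace above.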

With the refcounting work we've done that went in this merge window
we're halfway there - memory can now have a lifetime independent of the
current request and won't be freed when the process exits p9_client_rpc,
so we can send the flush and return immediately; then have the rest of
the cleanup happen asynchronously when the flush reply comes or the
client is torn down, whichever happens first.
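
Very roughly, the idea looks like the userspace sketch below. This is
an assumption-laden illustration, not the planned patch: struct req,
req_put(), rpc_interrupted() and flush_done_or_teardown() are
hypothetical names, the flush itself is reduced to a flag, and -EINTR
only stands in for whatever error the real code would return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a request; not the real struct p9_req_t. */
struct req {
    int refcount;   /* one reference for the caller, one for the transport */
    bool flushed;   /* a flush has been sent for this request */
};

static int live_reqs;   /* live allocations, so the sketch can be checked */

static struct req *req_alloc(void)
{
    struct req *r = calloc(1, sizeof(*r));

    if (!r)
        return NULL;
    r->refcount = 2;    /* caller + transport */
    live_reqs++;
    return r;
}

/* Drop one reference; the memory goes away only with the last one. */
static bool req_put(struct req *r)
{
    assert(r->refcount > 0);
    if (--r->refcount > 0)
        return false;
    free(r);
    live_reqs--;
    return true;
}

/* Caller side: on a fatal signal, send the flush and return right away. */
static int rpc_interrupted(struct req *r)
{
    r->flushed = true;  /* stands in for queuing the flush message */
    req_put(r);         /* caller's reference; the transport's keeps r alive */
    return -EINTR;
}

/* Transport side: the flush reply came in, or the client is torn down. */
static void flush_done_or_teardown(struct req *r)
{
    req_put(r);         /* last reference: the request is freed here */
}
```

The point of the second reference is that the killed task no longer has
to wait: whichever of the flush reply or the teardown comes first drops
the last reference and frees the request.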

I've got this planned for 4.21 if I can find the time to do it early in
this cycle and get it to work on the first try, or 4.22 if I run into
complications, to make sure it's well tested in -next first.
My free time is pretty limited this year, so unless you want to help
it'll get done when it's ready :)

--
Dominique
