Re: [PATCH next] io_uring: fix task hung in io_uring_setup

From: Stefano Garzarella
Date: Mon Sep 07 2020 - 04:43:39 EST


On Thu, Sep 03, 2020 at 09:21:19PM +0800, Hillf Danton wrote:
>
> The smart syzbot found the following issue:
>
> INFO: task syz-executor047:6853 blocked for more than 143 seconds.
> Not tainted 5.9.0-rc3-next-20200902-syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz-executor047 state:D stack:28104 pid: 6853 ppid: 6847 flags:0x00004000
> Call Trace:
> context_switch kernel/sched/core.c:3777 [inline]
> __schedule+0xea9/0x2230 kernel/sched/core.c:4526
> schedule+0xd0/0x2a0 kernel/sched/core.c:4601
> schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1855
> do_wait_for_common kernel/sched/completion.c:85 [inline]
> __wait_for_common kernel/sched/completion.c:106 [inline]
> wait_for_common kernel/sched/completion.c:117 [inline]
> wait_for_completion+0x163/0x260 kernel/sched/completion.c:138
> io_sq_thread_stop fs/io_uring.c:6906 [inline]
> io_finish_async fs/io_uring.c:6920 [inline]
> io_sq_offload_create fs/io_uring.c:7595 [inline]
> io_uring_create fs/io_uring.c:8671 [inline]
> io_uring_setup+0x1495/0x29a0 fs/io_uring.c:8744
> do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> because the sqo_thread kthread is created in io_sq_offload_create() without
> being woken up. In the error branch of that function we then wait for the
> sqo kthread, which never runs. Fix it by waking the kthread up before waiting.
>
> Reported-by: syzbot+107dd59d1efcaf3ffca4@xxxxxxxxxxxxxxxxxxxxxxxxx
> Fixes: dfe127799f8e ("io_uring: allow disabling rings during the creation")
> Cc: Stefano Garzarella <sgarzare@xxxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Signed-off-by: Hillf Danton <hdanton@xxxxxxxx>
> ---
>
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -6903,6 +6903,13 @@ static int io_sqe_files_unregister(struc
>  static void io_sq_thread_stop(struct io_ring_ctx *ctx)
>  {
>  	if (ctx->sqo_thread) {
> +		/*
> +		 * We may arrive here from the error branch in
> +		 * io_sq_offload_create() where the kthread is created without
> +		 * being waked up, thus wake it up now to make sure the wait will
> +		 * complete.
> +		 */
> +		wake_up_process(ctx->sqo_thread);
>  		wait_for_completion(&ctx->sq_thread_comp);
>  		/*
>  		 * The park is a bit of a work-around, without it we get
> --
>

Thanks for fixing this issue!
Jens already queued this, but just for the record:

Reviewed-by: Stefano Garzarella <sgarzare@xxxxxxxxxx>
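
For the archive, the failure mode described in the commit message can be
sketched in userspace. This is only an illustrative analogue, not kernel code:
it assumes the sqo thread signals the completion once it first runs, and it
models wake_up_process() with a flag plus a condition variable. The names
struct worker, worker_fn(), wake_worker() and stop_worker() are hypothetical,
and the 200 ms timeout stands in for the hung-task detector so the demo
returns instead of blocking forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a kthread that was created but not yet woken. */
struct worker {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool woken;	/* models wake_up_process() having been called */
	bool completed;	/* models the completion the stopper waits on */
};

/* The worker does nothing, not even signal completion, until it is woken. */
static void *worker_fn(void *arg)
{
	struct worker *w = arg;

	pthread_mutex_lock(&w->lock);
	while (!w->woken)
		pthread_cond_wait(&w->cond, &w->lock);
	w->completed = true;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

static void wake_worker(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->woken = true;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

/*
 * Returns 0 if the completion was observed, -ETIMEDOUT if the wait would
 * have hung, or -EAGAIN if the worker thread could not be created.
 */
static int stop_worker(bool wake_first)
{
	struct worker w = { .woken = false, .completed = false };
	struct timespec deadline;
	pthread_t tid;
	int ret;

	pthread_mutex_init(&w.lock, NULL);
	pthread_cond_init(&w.cond, NULL);
	if (pthread_create(&tid, NULL, worker_fn, &w) != 0)
		return -EAGAIN;

	if (wake_first)		/* the step the patch adds */
		wake_worker(&w);

	/* Bounded wait: 200 ms from now. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_nsec += 200L * 1000 * 1000;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&w.lock);
	while (!w.completed) {
		if (pthread_cond_timedwait(&w.cond, &w.lock, &deadline) == ETIMEDOUT)
			break;
	}
	ret = w.completed ? 0 : -ETIMEDOUT;
	pthread_mutex_unlock(&w.lock);

	/* Demo-only cleanup: release the worker so it can be joined. */
	wake_worker(&w);
	pthread_join(tid, NULL);
	assert(w.completed);
	pthread_cond_destroy(&w.cond);
	pthread_mutex_destroy(&w.lock);
	return ret;
}
```

With these assumptions, stop_worker(true) returns 0, while stop_worker(false)
returns -ETIMEDOUT, which corresponds to the blocked wait_for_completion() in
the trace above.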


Thanks,
Stefano