Re: [syzbot] WARNING: locking bug in umh_complete

From: Peter Zijlstra
Date: Mon Feb 13 2023 - 10:35:27 EST


On Sat, Feb 04, 2023 at 09:48:39AM +0900, Tetsuo Handa wrote:
> Please fold below diff,

Find final version below -- typing hard I suppose..

> provided that wait_for_completion_state(TASK_FREEZABLE)
> does not return when the current thread was frozen. (If
> wait_for_completion_state(TASK_FREEZABLE) returns when the current thread was
> frozen, we will fail to execute the

FREEZABLE should be transparent; that is it will only return when an
actual wakeup was missed, otherwise it will remain asleep.

Specifically, the FROZEN thing relies on wait loops to be resillient
against spurious wakeups. Consider do_wait_for_common(), the action() :=
schedule_timeout() might 'suriously' return after thawing, but it will
re-validate the actual completion condition and go back to sleep if it
hasn't happened yet.

OTOH, if the completeion condition has happened (right before the
completer itself was frozen for example, but after the waiter was
already frozen), then the 'spurious' wakeup on thaw is exactly what was
needed, the completion condition is satisfied and the wait terminated.

---
Subject: freezer,umh: Fix call_usermode_helper_exec() vs SIGKILL
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Fri, 3 Feb 2023 15:31:11 +0100

Tetsuo-San noted that commit f5d39b020809 ("freezer,sched: Rewrite
core freezer logic") broke call_usermodehelper_exec() for the KILLABLE
case.

Specifically it was missed that the second, unconditional,
wait_for_completion() was not optional and ensures the on-stack
completion is unused before going out-of-scope.

Fixes: f5d39b020809 ("freezer,sched: Rewrite core freezer logic")
Reported-by: syzbot+6cd18e123583550cf469@xxxxxxxxxxxxxxxxxxxxxxxxx
Reported-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Debugged-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: https://lkml.kernel.org/r/Y90ar35uKQoUrLEK@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
---
kernel/umh.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)

--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -438,21 +438,27 @@ int call_usermodehelper_exec(struct subp
if (wait == UMH_NO_WAIT) /* task has freed sub_info */
goto unlock;

- if (wait & UMH_KILLABLE)
- state |= TASK_KILLABLE;
-
if (wait & UMH_FREEZABLE)
state |= TASK_FREEZABLE;

- retval = wait_for_completion_state(&done, state);
- if (!retval)
- goto wait_done;
-
if (wait & UMH_KILLABLE) {
+ retval = wait_for_completion_state(&done, state | TASK_KILLABLE);
+ if (!retval)
+ goto wait_done;
+
/* umh_complete() will see NULL and free sub_info */
if (xchg(&sub_info->complete, NULL))
goto unlock;
+
+ /*
+ * fallthrough; in case of -ERESTARTSYS now do uninterruptible
+ * wait_for_completion_state(). Since umh_complete() shall call
+ * complete() in a moment if xchg() above returned NULL, this
+ * uninterruptible wait_for_completion_state() will not block
+ * SIGKILL'ed processes for long.
+ */
}
+ wait_for_completion_state(&done, state);

wait_done:
retval = sub_info->retval;