Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state

From: Zhihao Cheng
Date: Mon Aug 03 2020 - 22:58:28 EST


在 2020/8/4 6:11, Richard Weinberger 写道:
On Mon, Aug 3, 2020 at 4:01 AM Zhihao Cheng <chengzhihao1@xxxxxxxxxx> wrote:
Hmm, I see the problem but I fear this patch does not cure the race completely.
It just lowers the chance to hit it.
What if KTHREAD_SHOULD_STOP is set right after you checked for it?
The patch can handle this case. ubi_thread will exit at
kthread_should_stop() in next iteration.
How can it reach the next iteration?
Maybe I didn't fully get your explanation.

As far as I understand the problem correctly, the following happens:
1. ubi_thread is running and the program counter is somewhere between
"if (kthread_should_stop())"
and schedule()
2. While detaching kthread_stop() is called
3. Since the program counter in the thread is right before schedule(),
it does not check KTHREAD_SHOULD_STOP
and blindly calls into schedule()
4. The thread goes to sleep and nothing wakes it anymore -> endless wait.

Is this correct so far?
Oh, you're thinking about influence by schedule(), I get it. But I think it still works. Because the ubi_thread is still on runqueue, it will be scheduled to execute later anyway.

op                                                    state of ubi_thread           on runqueue
set_current_state(TASK_INTERRUPTIBLE) TASK_INTERRUPTIBLE              Yes
if (kthread_should_stop()) // not satisfy TASK_INTERRUPTIBLE              Yes
kthread_stop:
  wake_up_process
    ttwu_queue
      ttwu_do_activate
        ttwu_do_wakeup TASK_RUNNING                       Yes
schedule
  __schedule(false)

 // prev->state is TASK_RUNNING, so we cannot move it from runqueue by deactivate_task(). So just pick next task to execute, ubi_thread is still on runqueue and will be scheduled to execute later.


The test patch added mdelay(5000) before schedule(), which can make sure kthread_stop()->wake_up_process() executed before schedule(). Previous analysis can be proved through test.

@@ -1638,6 +1641,15 @@ int ubi_thread(void *u)
                    !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
                        set_current_state(TASK_INTERRUPTIBLE);
                        spin_unlock(&ubi->wl_lock);
+                       if (kthread_should_stop()) {
+                               set_current_state(TASK_RUNNING);
+                               break;
+                       }
+
+                       pr_err("Check should stop B\n");
+                       mdelay(5000);
+                       pr_err("delay 5000ms \n");
+
                        schedule();
                        continue;
                }


Your solution is putting another check for KTHREAD_SHOULD_STOP before
schedule().
I argue that this will just reduce the chance to hit the race window
because it can still happen
that kthread_stop() is being called right after the second check and
again before schedule().
Then we end up with the same situation.