Re: [PATCH] watchdog: Fix watchdog may detect false positive of softlockup

From: Luo Gengkun
Date: Wed Apr 16 2025 - 02:48:05 EST



On 2025/4/16 10:31, Andrew Morton wrote:
On Wed, 16 Apr 2025 01:39:22 +0000 Luo Gengkun <luogengkun@xxxxxxxxxxxxxxx> wrote:

The watchdog may dectect false positive of softlockup because of stop
softlockup after update watchdog_thresh. The problem can be described as
follow:

# We asuume previous watchdog_thresh is 60, so the timer is coming every
# 24s.
echo 10 > /proc/sys/kernel/watchdog_thresh (User space)
|
+------>+ update watchdog_thresh (We are in kernel now)
|
|
+------>+ watchdog hrtimer (irq context: detect softlockup)
|
|
+-------+
|
|
+ softlockup_stop_all

As showed above, there is a window between update watchdog_thresh and
softlockup_stop_all. During this window, if a timer is coming, a false
positive of softlockup will happen. To fix this problem, use a shadow
variable to store the new value and write back to watchdog_thresh after
softlockup_stop_all.

Changelog is a bit hard to follow. I asked gemini.google.com to clean
it up and it produced this:

: The watchdog may detect a false positive softlockup due to stopping the
: softlockup detection after updating `watchdog_thresh`. The problem can
: be described as follows:
:
: ```
: # Assume the previous watchdog_thresh is 60, so the timer triggers every 24 seconds.
: echo 10 > /proc/sys/kernel/watchdog_thresh (User space)
: |
: +------>+ Update watchdog_thresh (Kernel space)
: |
: |
: +------>+ Watchdog hrtimer (irq context: detect softlockup)
: |
: |
: +-------+
: |
: |
: + softlockup_stop_all
: ```
:
: As shown above, there is a window between updating `watchdog_thresh`
: and `softlockup_stop_all`. During this window, if a timer triggers, a
: false positive softlockup can occur. To fix this problem, a shadow
: variable should be used to store the new value, and this value should
: be written back to `watchdog_thresh` only after `softlockup_stop_all`
: has completed.

I don't know how accurate this is - please check&fix it and consider
incorporating the result?

--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -47,6 +47,7 @@ int __read_mostly watchdog_user_enabled = 1;
static int __read_mostly watchdog_hardlockup_user_enabled = WATCHDOG_HARDLOCKUP_DEFAULT;
static int __read_mostly watchdog_softlockup_user_enabled = 1;
int __read_mostly watchdog_thresh = 10;
+static int __read_mostly watchdog_thresh_shadow;
static int __read_mostly watchdog_hardlockup_available;
struct cpumask watchdog_cpumask __read_mostly;
@@ -876,6 +877,7 @@ static void __lockup_detector_reconfigure(void)
watchdog_hardlockup_stop();
softlockup_stop_all();
+ watchdog_thresh = READ_ONCE(watchdog_thresh_shadow);
I expect a reader of this code will wonder "what's all this
watchdog_thresh_shadow stuff". Can you please add a few small comments
to explain why we're doing this?

The following testcase may help to understand this problem.
---------------------------------------------
echo RT_RUNTIME_SHARE > /sys/kernel/debug/sched/features
echo -1 > /proc/sys/kernel/sched_rt_runtime_us
echo 0 > /sys/kernel/debug/sched/fair_server/cpu3/runtime
echo 60 > /proc/sys/kernel/watchdog_thresh
taskset -c 3 chrt -r 99 ./test &
echo 10 > /proc/sys/kernel/watchdog_thresh &
---------------------------------------------
This testcase first cancels the throttle rt task. Change the value of
watchdog_thresh to 60, and run a rt task on cpu3. The final command will
be blocked because the kworker:3, smp_call_on_cpu will put work on it,
cannot be selected by the scheduler due to the existence of real-time
threads. A softlockup will be detected on cpu3 after a while. The
reason is the watchdog_thresh is changed to 10, but the watchdog_timer_fn
still runs every 24 seconds, because the kworker:3
has not chance to stop previous hrtimer on cpu3.

The above case indicates that updating watchdog_thresh before doing
softlockup_stop_all is insecure, although there is not rt task on cpu3,
once the timer irq arrives, a false positive of softlockup is reported.

What this patch does, is first update the new thresh to
watchdog_thresh_shadow, which acts like a temporary variable, and write
back to watchdog_thresh after softlockup_stop_all is finished. By this way,
no more false positive case will be reported.