Re: [PATCH] watchdog: don't run proc_watchdog_update if new value is same as old

From: Ulrich Obergfell
Date: Fri Mar 18 2016 - 07:05:52 EST



Josh,

in https://lkml.org/lkml/2016/3/15/1 you stated that the soft lockup
messages do not occur with kernel v4.5. Hence, I believe this should
not be reproducible with kernel v4.4 either. The relevant changes in
update_watchdog_all_cpus() were introduced in kernel v4.3 by patches
that I mentioned in my previous reply. I see that

watchdog: use park/unpark functions in update_watchdog_all_cpus()

and related patches are included in this change log

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/log/kernel/watchdog.c?id=refs/tags/v4.4.6

For kernel v4.1, the issue appears to be mitigated by the access
rights on the watchdog parameters in /proc/sys/kernel, since only a
privileged user should be able to write to them, for example

/proc/sys/kernel/nmi_watchdog
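
One way to confirm the access rights on a given system is sketched
here. The helper name watchdog_knob_access is hypothetical, and the
sketch assumes a stat that accepts -c '%A' (GNU coreutils or busybox).
It only inspects the mode bits and does not write anything.

```shell
# Hypothetical helper (not part of any kernel tooling): report whether a
# watchdog parameter file can be written by users other than owner/group.
# Assumes 'stat -c %A' is available (GNU coreutils or busybox).
watchdog_knob_access() {
    knob="${1:-/proc/sys/kernel/nmi_watchdog}"
    if [ ! -e "$knob" ]; then
        echo "missing"
        return 1
    fi
    # %A prints the symbolic mode, e.g. "-rw-r--r--".
    perms=$(stat -c '%A' "$knob")
    case "$perms" in
        # Character 9 is the write bit for "other".
        ????????w*) echo "world-writable" ;;
        *)          echo "not world-writable" ;;
    esac
}

# Usage (read-only, safe to run):
#   watchdog_knob_access /proc/sys/kernel/nmi_watchdog
```

If this reports "not world-writable", an unprivileged user cannot
trigger the repeated updates discussed in this thread.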

Also, based on the analysis in my previous reply, I think these soft
lockup messages are 'false positives' as the repeated cancel/restart
of watchdog_timer_fn() prevents the 'watchdog/N' thread from running
(i.e. I think the thread is not prevented from running by something
actually hogging CPU N).


Regards,

Uli


----- Original Message -----
From: "Josh Hunt" <johunt@xxxxxxxxxx>
To: "Ulrich Obergfell" <uobergfe@xxxxxxxxxx>
Cc: "Don Zickus" <dzickus@xxxxxxxxxx>, akpm@xxxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
Sent: Thursday, March 17, 2016 5:08:03 PM
Subject: Re: [PATCH] watchdog: don't run proc_watchdog_update if new value is same as old

[...]

As you mention, my patch will mask this problem for 4.1, which is why I
wanted to get it into stable. Do you think there is any way to mitigate
this issue for the stable kernels (4.1 to 4.4) if the user changes the
values doing something like:

foo=1; while :; do echo $foo > /proc/sys/kernel/nmi_watchdog; foo=$(( ! $foo )); sleep .1; done & sleep 30 && kill %1

?

I realize this isn't a real-world use-case (I hope :)), but it shows
there is still a way to cause the box to soft lockup with this code path.
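
For readability, the one-liner above can be spelled out roughly as
follows. This is only a sketch: the function name toggle_knob and its
arguments are made up for illustration, it is bounded by an iteration
count instead of the 30-second background job, and it writes the
original value back when it finishes. Running it against the real
nmi_watchdog file needs root and may trigger the soft lockup messages
on affected kernels.

```shell
# toggle_knob FILE COUNT [DELAY]
# Hypothetical helper: flip a 0/1 parameter file COUNT times, pausing
# DELAY seconds between writes, then restore the value it started with.
toggle_knob() {
    knob=$1
    count=$2
    delay=${3:-0.1}   # fractional sleep assumes GNU or busybox sleep

    orig=$(cat "$knob") || return 1
    val=$orig
    i=0
    while [ "$i" -lt "$count" ]; do
        val=$(( ! $val ))        # logical NOT: 1 -> 0, 0 -> 1
        echo "$val" > "$knob"
        sleep "$delay"
        i=$(( i + 1 ))
    done

    # Leave the parameter as we found it.
    echo "$orig" > "$knob"
}

# Usage (as root, on a test box only):
#   toggle_knob /proc/sys/kernel/nmi_watchdog 300 0.1
```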

[...]