[tip:timers/urgent] clocksource: Prevent potential kgdb dead lock

From: tip-bot for Thomas Gleixner
Date: Tue Jan 26 2010 - 09:10:41 EST


Commit-ID: 7b7422a566aa0dc1e582ce263d4c7ff4a772700a
Gitweb: http://git.kernel.org/tip/7b7422a566aa0dc1e582ce263d4c7ff4a772700a
Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
AuthorDate: Tue, 26 Jan 2010 12:51:10 +0100
Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CommitDate: Tue, 26 Jan 2010 14:53:16 +0100

clocksource: Prevent potential kgdb dead lock

commit 0f8e8ef7 (clocksource: Simplify clocksource watchdog resume
logic) introduced a potential kgdb dead lock. When the kernel is
stopped by kgdb inside code which holds watchdog_lock then kgdb dead
locks in clocksource_resume_watchdog().

clocksource_resume_watchdog() is called from kbdg via
clocksource_touch_watchdog() to avoid that the clock source watchdog
marks TSC unstable after the kernel has been stopped.

Solve this by replacing spin_lock with a spin_trylock and just return
in case the lock is held. Not resetting the watchdog might result in
TSC becoming marked unstable, but that's an acceptable penalty for
using kgdb.

The timekeeping is anyway easily screwed up by kgdb when the system
uses either jiffies or a clock source which wraps in short intervals
(e.g. pm_timer wraps about every 4.6s), so we really do not have to
worry about that occasional TSC marked unstable side effect.

The second caller of clocksource_resume_watchdog() is
clocksource_resume(). The trylock is safe here as well because the
system is UP at this point, interrupts are disabled and nothing else
can hold watchdog_lock().

Reported-by: Jason Wessel <jason.wessel@xxxxxxxxxxxxx>
LKML-Reference: <1264480000-6997-4-git-send-email-jason.wessel@xxxxxxxxxxxxx>
Cc: kgdb-bugreport@xxxxxxxxxxxxxxxxxxxxx
Cc: Martin Schwidefsky <schwidefsky@xxxxxxxxxx>
Cc: John Stultz <johnstul@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
kernel/time/clocksource.c | 18 +++++++++++++++---
1 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index e85c234..1370083 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -343,7 +343,19 @@ static void clocksource_resume_watchdog(void)
{
unsigned long flags;

- spin_lock_irqsave(&watchdog_lock, flags);
+ /*
+ * We use trylock here to avoid a potential dead lock when
+ * kgdb calls this code after the kernel has been stopped with
+ * watchdog_lock held. When watchdog_lock is held we just
+ * return and accept, that the watchdog might trigger and mark
+ * the monitored clock source (usually TSC) unstable.
+ *
+ * This does not affect the other caller clocksource_resume()
+ * because at this point the kernel is UP, interrupts are
+ * disabled and nothing can hold watchdog_lock.
+ */
+ if (!spin_trylock_irqsave(&watchdog_lock, flags))
+ return;
clocksource_reset_watchdog();
spin_unlock_irqrestore(&watchdog_lock, flags);
}
@@ -458,8 +470,8 @@ void clocksource_resume(void)
* clocksource_touch_watchdog - Update watchdog
*
* Update the watchdog after exception contexts such as kgdb so as not
- * to incorrectly trip the watchdog.
- *
+ * to incorrectly trip the watchdog. This might fail when the kernel
+ * was stopped in code which holds watchdog_lock.
*/
void clocksource_touch_watchdog(void)
{
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/