Re: [crash, bisected] Re: clocksource: Resolve cpu hotplug deadlock with TSC unstable
From: Martin Schwidefsky
Date: Fri Sep 11 2009 - 09:33:26 EST
On Fri, 11 Sep 2009 09:37:47 +0200
Ingo Molnar <mingo@xxxxxxx> wrote:
>
> * Ingo Molnar <mingo@xxxxxxx> wrote:
>
> >
> > * Ingo Molnar <mingo@xxxxxxx> wrote:
> >
> > >
> > > * Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> > >
> > > > I went to try -tip btw, but it crashes on boot. Here's the
> > > > backtrace, typed manually, it's crashing in
> > > > queue_work_on+0x28/0x60.
> > > >
> > > > Call Trace:
> > > > queue_work
> > > > schedule_work
> > > > clocksource_mark_unstable
> > > > mark_tsc_unstable
> > > > check_tsc_sync_source
> > > > native_cpu_up
> > > > relay_hotcpu_callback
> > > > do_fork_idle
> > > > _cpu_up
> > > > cpu_up
> > > > kernel_init
> > > > kernel_thread_helper
> > >
> > > hm, that looks like an old bug i fixed days ago via:
> > >
> > > 00a3273: Revert "x86: Make tsc=reliable override boot time stability checks"
> > >
> > > Have you tested tip:master - do you still know which sha1?
> >
> > Ok, i reproduced it on a testbox and bisected it, the crash is
> > caused by:
> >
> > 7285dd7fd375763bfb8ab1ac9cf3f1206f503c16 is first bad commit
> > commit 7285dd7fd375763bfb8ab1ac9cf3f1206f503c16
> > Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Date: Fri Aug 28 20:25:24 2009 +0200
> >
> > clocksource: Resolve cpu hotplug dead lock with TSC unstable
> >
> > Martin Schwidefsky analyzed it:
> >
> > I've reverted it in tip/master for now.
>
> and that uncovers the circular locking bug that this commit was
> supposed to fix ...
>
> Martin?
This patch should fix the obvious problem that the watchdog_work
structure is not yet initialized if the clocksource watchdog is not
running yet.
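
For anyone who wants to see the failure mode outside the kernel, here is
a minimal userspace sketch of the difference between run-time and static
initialization of a work item. Everything named demo_* is hypothetical
and only mimics the shape of work_struct, INIT_WORK, DECLARE_WORK and
schedule_work; it is not the kernel implementation. In particular, the
sketch returns -1 for an uninitialized item, whereas the kernel does no
such check and crashes as in the trace above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct work_struct. */
struct demo_work {
	void (*func)(struct demo_work *work);
	int initialized;
};

/* Analog of DECLARE_WORK(): the item is valid from load time onwards. */
#define DEMO_DECLARE_WORK(name, fn) \
	struct demo_work name = { .func = (fn), .initialized = 1 }

/* Analog of INIT_WORK(): the item is only valid after this has run. */
static void demo_init_work(struct demo_work *work,
			   void (*fn)(struct demo_work *work))
{
	work->func = fn;
	work->initialized = 1;
}

/*
 * Analog of schedule_work(). Runs the handler directly and returns 0,
 * or returns -1 if the item was never initialized. The explicit check
 * is an assumption of this sketch; the kernel has no such safety net.
 */
static int demo_schedule_work(struct demo_work *work)
{
	if (!work->initialized || work->func == NULL)
		return -1;
	work->func(work);
	return 0;
}

/* Counts how often the handler ran, so the effect is observable. */
static int demo_calls;

static void demo_handler(struct demo_work *work)
{
	(void)work;
	demo_calls++;
}

/* Zero-filled until demo_init_work() runs, like the old watchdog_work. */
static struct demo_work late_work;

/* Usable immediately, like watchdog_work after the patch. */
static DEMO_DECLARE_WORK(early_work, demo_handler);
```

Scheduling late_work before demo_init_work() has been called fails,
which corresponds to clocksource_mark_unstable running before the
watchdog has been started. Scheduling early_work works at any time,
which is the behaviour the static initializer is meant to provide.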
--
Subject: [PATCH] clocksource: statically initialize watchdog workqueue
From: Martin Schwidefsky <schwidefsky@xxxxxxxxxx>
The watchdog timer is started after the watchdog clocksource and at least
one watched clocksource have been registered. The work element
watchdog_work is initialized just before the watchdog timer is started.
This is too late for the clocksource_mark_unstable call from native_cpu_up,
which can schedule watchdog_work before the watchdog is running.
To fix this, use a static initializer for watchdog_work.
Signed-off-by: Martin Schwidefsky <schwidefsky@xxxxxxxxxx>
---
kernel/time/clocksource.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
Index: linux-2.6/kernel/time/clocksource.c
===================================================================
--- linux-2.6.orig/kernel/time/clocksource.c
+++ linux-2.6/kernel/time/clocksource.c
@@ -123,10 +123,12 @@ static DEFINE_MUTEX(clocksource_mutex);
static char override_name[32];
#ifdef CONFIG_CLOCKSOURCE_WATCHDOG
+static void clocksource_watchdog_work(struct work_struct *work);
+
static LIST_HEAD(watchdog_list);
static struct clocksource *watchdog;
static struct timer_list watchdog_timer;
-static struct work_struct watchdog_work;
+static DECLARE_WORK(watchdog_work, clocksource_watchdog_work);
static DEFINE_SPINLOCK(watchdog_lock);
static cycle_t watchdog_last;
static int watchdog_running;
@@ -230,7 +232,6 @@ static inline void clocksource_start_wat
{
if (watchdog_running || !watchdog || list_empty(&watchdog_list))
return;
- INIT_WORK(&watchdog_work, clocksource_watchdog_work);
init_timer(&watchdog_timer);
watchdog_timer.function = clocksource_watchdog;
watchdog_last = watchdog->read(watchdog);
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.