[patch] clockevents_notify() needs to be called with irqs enabled

From: Suresh Siddha
Date: Thu Aug 13 2009 - 18:52:25 EST


From: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
Subject: clockevents_notify() needs to be called with irqs enabled

Currently clockevents_notify() is called with interrupts enabled in some
places and with interrupts disabled in others.

This results in a deadlock in the following scenario:

cpu A holding the clockevents_lock in clockevents_notify() with irq enabled
cpu B waiting for the clockevents_lock in clockevents_notify() with irq disabled
cpu C doing set_mtrr(), which will try to rendezvous all the cpus.

As a result, C and A reach the rendezvous point and wait for B, while B
is stuck forever waiting for the spinlock (held by A) and thus never
reaches the rendezvous point.

Fix the usage of clockevents_notify() so that it will always be called
with interrupts enabled and thus avoid the above deadlock.
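
The scenario above can be sketched as a small user-space model. This is
only an illustration, not kernel code: struct model_cpu, the helper names
and the step limit are all hypothetical, and the model assumes that a cpu
with interrupts enabled takes the rendezvous IPI right away while a cpu
with interrupts disabled can only make progress by getting the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one cpu; none of this exists in the kernel. */
struct model_cpu {
	bool irqs_enabled;	/* can this cpu take the rendezvous IPI? */
	bool holds_lock;	/* owns the modelled clockevents_lock */
	bool wants_lock;	/* spinning on the modelled clockevents_lock */
	bool in_rendezvous;	/* has reached the rendezvous point */
};

static bool lock_is_held(const struct model_cpu *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (cpus[i].holds_lock)
			return true;
	return false;
}

/*
 * Run the model for at most max_steps rounds. Returns true once every cpu
 * has reached the rendezvous, false if some cpu is still missing at the end.
 */
static bool rendezvous_completes(struct model_cpu *cpus, size_t n, int max_steps)
{
	assert(n > 0);

	for (int step = 0; step < max_steps; step++) {
		size_t arrived = 0;

		for (size_t i = 0; i < n; i++) {
			struct model_cpu *c = &cpus[i];

			if (c->in_rendezvous) {
				arrived++;
				continue;
			}
			if (c->irqs_enabled) {
				/* IPI taken; any held lock stays held. */
				c->in_rendezvous = true;
				arrived++;
				continue;
			}
			/* irqs off: the only way forward is the lock. */
			if (c->wants_lock && !lock_is_held(cpus, n)) {
				/* Short critical section, then irqs back on. */
				c->wants_lock = false;
				c->irqs_enabled = true;
			}
		}
		if (arrived == n)
			return true;
	}
	return false;
}

/* cpu B spins for the lock with interrupts disabled. */
bool rendezvous_before_patch(void)
{
	struct model_cpu cpus[3] = {
		{ .irqs_enabled = true,  .holds_lock = true },		/* A */
		{ .irqs_enabled = false, .wants_lock = true },		/* B */
		{ .irqs_enabled = false, .in_rendezvous = true },	/* C */
	};

	return rendezvous_completes(cpus, 3, 1000);
}

/* cpu B spins for the lock with interrupts enabled. */
bool rendezvous_after_patch(void)
{
	struct model_cpu cpus[3] = {
		{ .irqs_enabled = true,  .holds_lock = true },		/* A */
		{ .irqs_enabled = true,  .wants_lock = true },		/* B */
		{ .irqs_enabled = false, .in_rendezvous = true },	/* C */
	};

	return rendezvous_completes(cpus, 3, 1000);
}
```

In this model rendezvous_before_patch() never completes, because A sits in
the rendezvous while still holding the lock that B needs, and
rendezvous_after_patch() completes because B can take the IPI while it
waits for the lock.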

This issue left us wondering whether we need to change the MTRR rendezvous
logic to use stop machine logic (instead of smp_call_function), or add a
check in the spinlock debug code to see if there are other spinlocks which
get taken under both interrupts-enabled and interrupts-disabled conditions.

Signed-off-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@xxxxxxxxx>
---

Index: tip/arch/x86/kernel/process.c
===================================================================
--- tip.orig/arch/x86/kernel/process.c
+++ tip/arch/x86/kernel/process.c
@@ -520,17 +520,13 @@ static void c1e_idle(void)
cpu);
local_irq_disable();
}
+ local_irq_enable();
clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
+ local_irq_disable();

default_idle();

- /*
- * The switch back from broadcast mode needs to be
- * called with interrupts disabled.
- */
- local_irq_disable();
clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);
- local_irq_enable();
} else
default_idle();
}
Index: tip/drivers/acpi/processor_idle.c
===================================================================
--- tip.orig/drivers/acpi/processor_idle.c
+++ tip/drivers/acpi/processor_idle.c
@@ -829,16 +829,18 @@ static int acpi_idle_enter_c1(struct cpu
if (unlikely(!pr))
return 0;

+ local_irq_enable();
+ lapic_timer_state_broadcast(pr, cx, 1);
local_irq_disable();

/* Do not access any ACPI IO ports in suspend path */
if (acpi_idle_suspend) {
local_irq_enable();
cpu_relax();
+ lapic_timer_state_broadcast(pr, cx, 0);
return 0;
}

- lapic_timer_state_broadcast(pr, cx, 1);
kt1 = ktime_get_real();
acpi_idle_do_entry(cx);
kt2 = ktime_get_real();
@@ -873,6 +875,13 @@ static int acpi_idle_enter_simple(struct
if (acpi_idle_suspend)
return(acpi_idle_enter_c1(dev, state));

+ local_irq_enable();
+ /*
+ * Must be done before busmaster disable as we might need to
+ * access HPET !
+ */
+ lapic_timer_state_broadcast(pr, cx, 1);
+
local_irq_disable();
current_thread_info()->status &= ~TS_POLLING;
/*
@@ -884,15 +893,10 @@ static int acpi_idle_enter_simple(struct
if (unlikely(need_resched())) {
current_thread_info()->status |= TS_POLLING;
local_irq_enable();
+ lapic_timer_state_broadcast(pr, cx, 0);
return 0;
}

- /*
- * Must be done before busmaster disable as we might need to
- * access HPET !
- */
- lapic_timer_state_broadcast(pr, cx, 1);
-
if (cx->type == ACPI_STATE_C3)
ACPI_FLUSH_CPU_CACHE();

@@ -957,6 +961,12 @@ static int acpi_idle_enter_bm(struct cpu
return 0;
}
}
+ local_irq_enable();
+ /*
+ * Must be done before busmaster disable as we might need to
+ * access HPET !
+ */
+ lapic_timer_state_broadcast(pr, cx, 1);

local_irq_disable();
current_thread_info()->status &= ~TS_POLLING;
@@ -969,6 +979,7 @@ static int acpi_idle_enter_bm(struct cpu
if (unlikely(need_resched())) {
current_thread_info()->status |= TS_POLLING;
local_irq_enable();
+ lapic_timer_state_broadcast(pr, cx, 0);
return 0;
}

@@ -976,11 +987,6 @@ static int acpi_idle_enter_bm(struct cpu

/* Tell the scheduler that we are going deep-idle: */
sched_clock_idle_sleep_event();
- /*
- * Must be done before busmaster disable as we might need to
- * access HPET !
- */
- lapic_timer_state_broadcast(pr, cx, 1);

kt1 = ktime_get_real();
/*

