[PATCH RT 07/16] kernel/cpu: fix cpu down problem if kthread's cpu is going down

From: Steven Rostedt
Date: Mon Sep 09 2013 - 09:35:55 EST


From: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>

If a kthread is pinned to CPUx and CPUx is going down, then we get into
trouble:
- first the unplug thread is created
- it sets hp->unplug to itself. As a result, every task that is
  going to take a lock has to leave the CPU.
- the CPU_DOWN_PREPARE notifiers are started. The worker thread will
  start a new process for the "high priority worker".
Now the kthread would like to take a lock, but since it can't leave the
CPU it will never complete its task.

We could start the unplug thread after the notifiers, but then the CPU
is no longer marked "online" and the unplug thread would run on CPU0,
a problem that was fixed before :)

So instead the unplug thread is started early and kept waiting until the
notifiers complete their work.

Cc: stable-rt@xxxxxxxxxxxxxxx
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
---
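Not part of the patch: a minimal user-space sketch of the ordering the
change is meant to enforce. Everything here is an assumption made for
illustration. struct completion and its three helpers are re-implemented
on top of pthreads (with a plain flag instead of the kernel's counter),
and struct hotplug_sim, notifiers_done, seen_by_thread and
simulate_cpu_down() are hypothetical names that do not exist in
kernel/cpu.c. The point is only that the unplug thread exists early but
does not proceed until the down path has finished its notifier work.

```c
#include <pthread.h>
#include <stdbool.h>

/* User-space stand-in for the kernel's struct completion (assumption). */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical, heavily reduced model of struct hotplug_pcp. */
struct hotplug_sim {
	struct completion unplug_wait;
	struct completion synced;
	int notifiers_done;	/* set by the simulated CPU-down path */
	int seen_by_thread;	/* notifiers_done as observed by the thread */
};

static void *sync_unplug_thread(void *data)
{
	struct hotplug_sim *hp = data;

	/* Created early, but parked until the notifiers have run. */
	wait_for_completion(&hp->unplug_wait);
	hp->seen_by_thread = hp->notifiers_done;
	complete(&hp->synced);
	return NULL;
}

/*
 * Returns 1 if the thread proceeded only after the simulated notifiers
 * ran, 0 if it ran too early, -1 if the thread could not be created.
 */
int simulate_cpu_down(void)
{
	struct hotplug_sim hp = { .notifiers_done = 0, .seen_by_thread = 0 };
	pthread_t tsk;

	init_completion(&hp.synced);
	init_completion(&hp.unplug_wait);

	/* Counterpart of kthread_create() + wake_up_process(). */
	if (pthread_create(&tsk, NULL, sync_unplug_thread, &hp))
		return -1;

	/* Stand-in for the CPU_DOWN_PREPARE notifier work. */
	hp.notifiers_done = 1;

	/* Counterpart of __cpu_unplug_wait(): release, then wait. */
	complete(&hp.unplug_wait);
	wait_for_completion(&hp.synced);

	pthread_join(tsk, NULL);
	return hp.seen_by_thread;
}
```

Calling simulate_cpu_down() from a small main() and building with
-pthread is enough to try it. The sketch leaves out pinning, preemption
and the grab_lock logic entirely.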
kernel/cpu.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 5ef0c31..f56a8ec 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -78,6 +78,7 @@ struct hotplug_pcp {
 	int refcount;
 	int grab_lock;
 	struct completion synced;
+	struct completion unplug_wait;
 #ifdef CONFIG_PREEMPT_RT_FULL
 	spinlock_t lock;
 #else
@@ -175,6 +176,7 @@ static int sync_unplug_thread(void *data)
 {
 	struct hotplug_pcp *hp = data;
 
+	wait_for_completion(&hp->unplug_wait);
 	preempt_disable();
 	hp->unplug = current;
 	wait_for_pinned_cpus(hp);
@@ -240,6 +242,14 @@ static void __cpu_unplug_sync(struct hotplug_pcp *hp)
 	wait_for_completion(&hp->synced);
 }
 
+static void __cpu_unplug_wait(unsigned int cpu)
+{
+	struct hotplug_pcp *hp = &per_cpu(hotplug_pcp, cpu);
+
+	complete(&hp->unplug_wait);
+	wait_for_completion(&hp->synced);
+}
+
 /*
  * Start the sync_unplug_thread on the target cpu and wait for it to
  * complete.
@@ -263,6 +273,7 @@ static int cpu_unplug_begin(unsigned int cpu)
 	tell_sched_cpu_down_begin(cpu);
 
 	init_completion(&hp->synced);
+	init_completion(&hp->unplug_wait);
 
 	hp->sync_tsk = kthread_create(sync_unplug_thread, hp, "sync_unplug/%d", cpu);
 	if (IS_ERR(hp->sync_tsk)) {
@@ -278,8 +289,7 @@ static int cpu_unplug_begin(unsigned int cpu)
 	 * wait for tasks that are going to enter these sections and
 	 * we must not have them block.
 	 */
-	__cpu_unplug_sync(hp);
-
+	wake_up_process(hp->sync_tsk);
 	return 0;
 }

@@ -545,6 +555,8 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
 		goto out_release;
 	}
 
+	__cpu_unplug_wait(cpu);
+
 	/* Notifiers are done. Don't let any more tasks pin this CPU. */
 	cpu_unplug_sync(cpu);

--
1.7.10.4

