hotplug thread issues

From: Peter Zijlstra
Date: Thu Nov 06 2014 - 10:02:14 EST


Hi Thomas,

So there have been some reports on hitting:

BUG_ON(td->cpu != smp_processor_id());

in smpboot_thread_fn.

Now I've been staring at this for a wee bit today and I've found two
issues, but I'm not sure either is enough to explain what was observed.

1) smpboot_register_percpu_thread() seems to lack serialization against
hotplug. It has a for_each_online_cpu() loop, but no get_online_cpus(),
unlike smpboot_unregister_percpu_thread(), which has one.

Typical usage like spawn_ksoftirqd() should be fine: those are early
init calls, and they run before we bring up the other CPUs. Therefore
this does not explain the observation that it's ksoftirqd/n triggering
the BUG.

However, the usage in proc_dowatchdog() is susceptible to this race
and it's entirely possible for things to go wrong there.


2) the usage of __set_current_state(TASK_PARKED) in __kthread_parkme()
is wrong AFAICT; one should always use set_current_state() for
setting a !TASK_RUNNING state, because it implies a memory barrier
after the store. The comment with set_current_state() explains why.

Without the barrier that set_current_state() implies, the
test_bit(KTHREAD_SHOULD_PARK) load could have been satisfied before
the store of TASK_PARKED became visible.
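
For illustration, the ordering problem can be modelled in userspace
with C11 atomics. This is only a sketch under assumptions: the model_*
names are hypothetical, a relaxed store plus a seq_cst fence stands in
for set_current_state(), and the waker side (clear the flag, then
inspect the sleeper's state) is my reading of the usual sleep/wakeup
pattern rather than a copy of the kthread code.

```c
/* Userspace model only: C11 atomics stand in for the kernel primitives. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

enum { MODEL_RUNNING = 0, MODEL_PARKED = 1 };

static atomic_int model_state;		/* stands in for current->state */
static atomic_int model_should_park;	/* stands in for KTHREAD_SHOULD_PARK */
static int model_saw_should_park;	/* result of the parker's load */
static int model_saw_parked;		/* result of the waker's load */

/* Model of set_current_state(): the store, then a full barrier. */
static void model_set_current_state(int state)
{
	atomic_store_explicit(&model_state, state, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}

/* Parker: publish the sleep state, then test the park condition. */
static void *model_parker(void *unused)
{
	(void)unused;
	model_set_current_state(MODEL_PARKED);
	model_saw_should_park =
		atomic_load_explicit(&model_should_park, memory_order_relaxed);
	return NULL;
}

/* Waker: clear the condition, then look at the sleeper's state. */
static void *model_waker(void *unused)
{
	(void)unused;
	atomic_store_explicit(&model_should_park, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	model_saw_parked =
		atomic_load_explicit(&model_state, memory_order_relaxed) == MODEL_PARKED;
	return NULL;
}

/* Count trials where the parker would sleep and the waker would not wake it. */
int model_count_lost_wakeups(int trials)
{
	int lost = 0;

	assert(trials > 0);
	for (int i = 0; i < trials; i++) {
		pthread_t parker, waker;

		atomic_store(&model_state, MODEL_RUNNING);
		atomic_store(&model_should_park, 1);
		if (pthread_create(&parker, NULL, model_parker, NULL))
			return -1;
		if (pthread_create(&waker, NULL, model_waker, NULL)) {
			pthread_join(parker, NULL);
			return -1;
		}
		pthread_join(parker, NULL);
		pthread_join(waker, NULL);
		/* Parker still saw SHOULD_PARK, yet waker saw it as running. */
		if (model_saw_should_park && !model_saw_parked)
			lost++;
	}
	return lost;
}
```

With both fences in place the C11 memory model forbids the outcome
counted here, so model_count_lost_wakeups() returns 0. Removing the
fence from model_set_current_state() corresponds to the
__set_current_state() case: the bad outcome becomes permitted, although
any given run may not hit it.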


In any case, I'm not sure either of these is enough; I'll go stare at
it a bit more I suppose.

---
kernel/kthread.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 10e489c448fe..9787244d43ec 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -156,12 +156,12 @@ void *probe_kthread_data(struct task_struct *task)

 static void __kthread_parkme(struct kthread *self)
 {
-	__set_current_state(TASK_PARKED);
+	set_current_state(TASK_PARKED);
 	while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
 		if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
 			complete(&self->parked);
 		schedule();
-		__set_current_state(TASK_PARKED);
+		set_current_state(TASK_PARKED);
 	}
 	clear_bit(KTHREAD_IS_PARKED, &self->flags);
 	__set_current_state(TASK_RUNNING);


