Re: [lhcs-devel] Re: CPU Hotplug broken -mm5 onwards

From: Nick Piggin
Date: Mon Apr 19 2004 - 17:56:22 EST

Srivatsa Vaddagiri wrote:
On Mon, Apr 19, 2004 at 01:34:14PM +1000, Nick Piggin wrote:

I think a rwsem might be a good idea anyway, because
sched_migrate_task can end up being called pretty often with
balance on exec and balance on clone. The semaphore could easily
place undue serialisation on that path.

I found that an r/w sem does not help here. It can still lead to
deadlocks. One example I hit is:

cpu_up takes the write lock and sends out the CPU_UP_PREPARE
notification. As part of it, many notifiers do kthread_create, which
uses a workqueue. The work function is never processed because keventd
is blocked on a previous work function, waiting for the hotplug sem in
the exec path.

So, as Rusty said, I think we really need to consider removing
lock_cpu_hotplug from sched_migrate_task. AFAICS that lock
was needed to prevent adding tasks to dead cpus. The same can be
accomplished by removing lock_cpu_hotplug from sched_migrate_task
and adding a cpu_is_offline check in __migrate_task.
This will eliminate all the deadlocks I have been hitting.
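
A minimal user-space sketch of the check described above, not the
actual patch. The names struct task, cpu_online_map (as a plain array)
and migrate_task_sketch are hypothetical stand-ins, and the runqueue
locking that the real __migrate_task would hold is only noted in a
comment:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the kernel's online map (assumption). */
static bool cpu_online_map[NR_CPUS] = { true, true, true, true };

/* Hypothetical, stripped-down task: only remembers which CPU it is on. */
struct task {
	int cpu;
};

static bool cpu_is_offline(int cpu)
{
	return !cpu_online_map[cpu];
}

/*
 * Move p from src_cpu to dest_cpu. Instead of taking a global hotplug
 * lock around the whole migration, refuse the move at the last step if
 * the destination has gone offline. Returns true if the task moved.
 */
static bool migrate_task_sketch(struct task *p, int src_cpu, int dest_cpu)
{
	/* Real code would hold both runqueue locks at this point. */
	if (cpu_is_offline(dest_cpu))
		return false;	/* never queue a task on a dead CPU */

	if (p->cpu != src_cpu)
		return false;	/* task already moved elsewhere */

	p->cpu = dest_cpu;
	return true;
}
```

With this shape the caller never sleeps on a hotplug semaphore; a
migration that races with a CPU going down is simply dropped.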

Yes this would be a better idea. Care to send Andrew a patch
against -mm?

Can we arrange for some of these checks to disappear when HOTPLUG_CPU
is not set? For example, make cpu_is_offline only valid to call for
CPUs that have been online at some time, so that it can evaluate to 0
if HOTPLUG_CPU is not set?

I think this is already being done in include/linux/cpu.h

Yes I see. I didn't realise the first one was under an ifdef :P
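
For readers following along, a rough sketch of the ifdef pattern being
discussed, assuming nothing about the exact contents of
include/linux/cpu.h. The names cpu_is_offline_sketch,
online_mask_sketch and pick_dest_cpu_sketch are hypothetical:

```c
#include <assert.h>

#ifdef CONFIG_HOTPLUG_CPU
/* Hypothetical online bitmask; bit n set means CPU n is online. */
static unsigned long online_mask_sketch = 0xfUL;
#define cpu_is_offline_sketch(cpu) \
	(!((online_mask_sketch >> (cpu)) & 1UL))
#else
/* Without hotplug no CPU can go away, so the test is a constant 0. */
#define cpu_is_offline_sketch(cpu) 0
#endif

/*
 * Pick the wanted CPU unless it is offline. When CONFIG_HOTPLUG_CPU is
 * not defined the condition is constant false, so the compiler can
 * drop the branch entirely.
 */
static int pick_dest_cpu_sketch(int wanted, int fallback)
{
	if (cpu_is_offline_sketch(wanted))
		return fallback;
	return wanted;
}
```

Callers keep one code path in both configurations; only the macro
definition changes.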
