Re: CPU hotplug broken in 2.6.8-rc2 ?

From: Nathan Lynch
Date: Wed Aug 04 2004 - 01:07:22 EST


On Mon, 2004-08-02 at 15:26, Nathan Lynch wrote:
> On Mon, 2004-08-02 at 14:38, Nathan Lynch wrote:
> > Could you try on 2.6.8-rc2-mm2 along with this patch? Vatsa had a patch
> > go in that should prevent the crash you are seeing -- the patch below is
> > needed to prevent the same crash in the offline case. This check used
> > to be in load_balance and some other scheduler functions, iirc; does
> > anyone know why they were removed?
>
> Er, I meant to put the check in rebalance_tick, not load_balance.
>
> However, after a few minutes with this, I hit the BUG_ON in the CPU_DEAD
> case in migration_call; not sure whether this is a separate issue.

So, with the cpu_is_offline check in rebalance_tick on top of
2.6.8-rc2-mm2, this is the BUG_ON in migration_call I tend to hit while
hotplugging cpus as quickly as possible while running make -j 40:

case CPU_DEAD:
migrate_all_tasks(cpu);
rq = cpu_rq(cpu);
kthread_stop(rq->migration_thread);
rq->migration_thread = NULL;
/* Idle task back to normal (off runqueue, low prio) */
rq = task_rq_lock(rq->idle, &flags);
deactivate_task(rq->idle, rq);
rq->idle->static_prio = MAX_PRIO;
__setscheduler(rq->idle, SCHED_NORMAL, 0);
task_rq_unlock(rq, &flags);
BUG_ON(rq->nr_running != 0);

I can reproduce this on both ppc64 and i386. Does anyone know why this
is happening?

If I remove the BUG_ON, things seem to go ok, but I doubt that's the
right thing to do.

Nathan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/