Re: [PATCH] cpu hotplug, sched:Introduce cpu_active_map and redoscheddomainmanagment(take 2)
From: Max Krasnyansky
Date: Tue Jul 22 2008 - 15:33:17 EST
Gregory Haskins wrote:
Max Krasnyansky wrote:
Greg, correct me if I'm wrong but we seem to have exact same issue
with the rq->rq->online map. Lets take "cpu going down" for
example. We're clearing rq->rd->online bit on DYING event, but
nothing AFAICS prevents another cpu calling
rebuild_sched_domains()->partition_sched_domains() in the middle of
the hotplug sequence. partition_sched_domains() will happily reset
rd->rq->online mask and things will fail. I'm talking about this
path
__build_sched_domains() -> cpu_attach_domain() -> rq_attach_root()
...
cpu_set(rq->cpu, rd->span);
if (cpu_isset(rq->cpu, cpu_online_map))
set_rq_online(rq);
...
I think you are right, but wouldn't s/online/active above fix that as
well? The active_map didnt exist at the time that code went in
initially ;)
Actually after a bit more thinking :) I realized that the scenario I
explained above cannot happen because partition_sched_domains() must be
called under get_online_cpus() and the set_rq_online() happens in the
hotplug writer's path (ie under cpu_hotplug.lock). Since I unified all
the other domain rebuild paths (arch_reinit_sched_domains, etc) we
should be safe. But it again means we'd rely on those intricate
dependencies that we wanted to avoid with the cpu_active_map. Also
cpusets might still need to rebuild the domains in the hotplug writer's
path.
So it's better to fix it once and for all :)
--
btw Why didn't we convert sched*.c to use rq->rd->online when it was
introduced ? ie Instead of using cpu_online_map directly.
I think things were converted where they made sense to convert. But we
also had a different goal at that time, so perhaps something was
missed. If you think something else should be converted, please point
it out.
Ok. I'll keep an eye on it.
In the meantime, I would suggest we consider this patch on top of yours
(applies to tip/sched/devel):
----------------------
sched: Fully integrate cpus_active_map and root-domain code
Signed-off-by: Gregory Haskins <ghaskins@xxxxxxxxxx>
Ack.
The only thing I'm a bit unsure of is the error scenarios in the cpu
hotplug event sequence. online_map is not cleared when something in the
notifier chain fails, but active_map is.
Max
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/