Re: [PATCH] cpusets+hotplug+preempt broken

From: Paul Jackson
Date: Fri May 13 2005 - 21:25:36 EST


Two days ago, I wrote:
> Another variation will be forthcoming soon.

Don't apply the following yet, Andrew. It is untested, and we've not
yet obtained agreement.

I'll squawk again, if this patch survives long enough to warrant inclusion.

===

Ah to heck with it. This subtle distinction over what level of cpuset
we fall back to when a hot unplug leaves a task with no online cpu in
its current allowed set is not worth it.

Every variation I consider is either sufficiently complicated that I
can't be sure it's right, or sufficiently simple that it's obviously
broken.

Revert the move_task_off_dead_cpu() code to its previous form, from before
cpusets were added. If none of the remaining allowed cpus are online,
then let the task run on any cpu, no limit. This is a legal fallback,
and indeed one of the possible outcomes of the previous code. It's just
not so Nice.

If a system administrator doesn't like a task being allowed to run
anywhere as a result of this, then they should clear out a cpuset of the
tasks running in it, before they take the last cpu in that cpuset
offline, and they should use taskset (sched_setaffinity) or other means
to ensure that tasks aren't pinned to a cpu that is about to be taken
offline.
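
For the affinity half of that advice, here is a rough userspace sketch of
what "use sched_setaffinity" could look like for one task. It is only an
illustration: mask_without_cpu() and unpin_from_cpu() are names made up for
this note, not existing interfaces, and the sketch assumes glibc's cpu_set_t
macros. It refuses to act when the doomed cpu is the only one the task
is allowed on, since that case needs a human decision.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for the CPU_* macros and sched_setaffinity() */
#endif
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper: copy *in to *out with doomed_cpu cleared,
 * and return how many cpus remain set in *out.
 */
static int mask_without_cpu(const cpu_set_t *in, int doomed_cpu, cpu_set_t *out)
{
	*out = *in;
	if (doomed_cpu >= 0 && doomed_cpu < CPU_SETSIZE)
		CPU_CLR(doomed_cpu, out);
	return CPU_COUNT(out);
}

/*
 * Hypothetical helper: drop doomed_cpu from the affinity of task pid
 * (pid 0 means the calling thread). Returns 0 on success, -1 with
 * errno set on failure. Fails with EINVAL rather than leave the task
 * with an empty mask.
 */
static int unpin_from_cpu(pid_t pid, int doomed_cpu)
{
	cpu_set_t cur, next;

	if (sched_getaffinity(pid, sizeof(cur), &cur) != 0)
		return -1;
	if (mask_without_cpu(&cur, doomed_cpu, &next) == 0) {
		errno = EINVAL;
		return -1;
	}
	return sched_setaffinity(pid, sizeof(next), &next);
}
```

An administrator's script would call something like unpin_from_cpu(pid, cpu)
for each affected task before offlining the cpu, and treat an EINVAL result
as "this task must be moved or widened by hand first".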

Unless and until someone can make a good case to the contrary, it is not
worth nesting hotplug and cpuset semaphores to attempt to provide a more
subtle fallback, that few people would understand anyway.

At least do one thing right - attach the task to the top_cpuset if we
have to force its cpus_allowed there. That keeps the task's apparent
cpuset in sync with its cpus_allowed (any online cpu or CPU_MASK_ALL,
which are roughly equivalent in this context).
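
To make the intended fallback concrete, here is a small userspace model of
it. This is a sketch, not the kernel code: every model_* name and the 8-cpu
bitmask are assumptions invented for illustration, and it leaves out the
earlier step in move_task_off_dead_cpu() that first tries cpus on the dead
cpu's node. It shows only the two things this patch is meant to do: widen
cpus_allowed to every cpu when nothing allowed is online, and attach the
task to the top cpuset at the same time.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS		8	/* hypothetical small machine */
#define MODEL_CPU_MASK_ALL	((uint32_t)((1u << MODEL_NR_CPUS) - 1))

struct model_cpuset {
	const char *name;
};

/* Stand-in for the kernel's top_cpuset. */
static struct model_cpuset model_top_cpuset = { "top" };

struct model_task {
	uint32_t cpus_allowed;		/* bit n set: task may run on cpu n */
	struct model_cpuset *cpuset;	/* cpuset the task is attached to */
};

/* Lowest-numbered cpu set in both masks, or MODEL_NR_CPUS if none. */
static int model_any_online_cpu(uint32_t mask, uint32_t online)
{
	uint32_t both = mask & online;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (both & (1u << cpu))
			return cpu;
	return MODEL_NR_CPUS;
}

/* Pick a destination cpu for tsk, applying the fallback if needed. */
static int model_pick_dest_cpu(struct model_task *tsk, uint32_t online)
{
	int dest_cpu;

	/* The model assumes at least one cpu is still online. */
	assert(online & MODEL_CPU_MASK_ALL);

	dest_cpu = model_any_online_cpu(tsk->cpus_allowed, online);
	if (dest_cpu == MODEL_NR_CPUS) {
		/* No more Mr. Nice Guy: allow every cpu ... */
		tsk->cpus_allowed = MODEL_CPU_MASK_ALL;
		/* ... and keep the apparent cpuset in sync with that. */
		tsk->cpuset = &model_top_cpuset;
		dest_cpu = model_any_online_cpu(tsk->cpus_allowed, online);
	}
	return dest_cpu;
}
```

In this model a task allowed only on cpus 2 and 3, with only cpus 0 and 1
online, ends up on cpu 0 with a full mask and attached to the top cpuset.
A task that still has an allowed cpu online is left alone.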

Signed-off-by: Paul Jackson <pj@xxxxxxx>

diff -Naurp 2.6.12-rc1-mm4.orig/kernel/sched.c 2.6.12-rc1-mm4/kernel/sched.c
--- 2.6.12-rc1-mm4.orig/kernel/sched.c 2005-05-13 18:39:54.000000000 -0700
+++ 2.6.12-rc1-mm4/kernel/sched.c 2005-05-13 19:02:49.000000000 -0700
@@ -4301,7 +4301,8 @@ static void move_task_off_dead_cpu(int d
 
 	/* No more Mr. Nice Guy. */
 	if (dest_cpu == NR_CPUS) {
-		tsk->cpus_allowed = cpuset_cpus_allowed(tsk);
+		cpus_setall(tsk->cpus_allowed);
+		tsk->cpuset = &top_cpuset;
 		dest_cpu = any_online_cpu(tsk->cpus_allowed);
 
 		/*

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxxxxxxx> 1.650.933.1373, 1.925.600.0401